ABSTRACT


Repair and replacement of cast iron water mains is a significant issue for many water 
utilities. Having the ability to predict the frequency of water main breaks gives utilities 
improved information for decision-making. In this study, Artificial Neural Network
(ANN) methodology was used to predict pipe break frequency, using historical pipe
break data from a city subdivision.


The application of this modeling methodology is ideal due to its ability to use readily 
available meteorological and operational data. Due to the open-system nature of this 
study, extra care was taken to ensure use of reliable data. Once their results have been
proven, the ANN models could then be used to infer factors affecting pipe breaks, and
to develop mitigation techniques.

1.0 INTRODUCTION


In many urban areas, the frequency of cast iron water main pipe breaks has increased 
with time. The unavoidable question of how to deal with aging water distribution 
infrastructure has developed into issues of total costs associated with repair of pipe 
breaks (and consequent interruptions in water delivery service), and replacement of 
deteriorated water mains. These pipe breaks are viewed with greater seriousness than in
previous times by the water utilities, from both an economic and a customer relations
standpoint. As a result, many utilities have sought methods for predicting these events to
allow for proactive replacement of the deteriorated pipes, thus reducing the burden of
emergency repairs. A secondary benefit of developing a rational model is to determine
the prevailing nature of these failures, allowing for pipe break mitigative techniques
where pipe replacement is not warranted.


The City of Edmonton’s water utility, Aqualta, has sponsored the investigation of the use 
of Artificial Neural Networks (ANN) modeling to assist in the prediction of cast iron pipe 
break trends for city subdivisions. The purpose of the ANN model is to identify areas
that will have a higher cumulative probability of cast iron distribution pipe failures,
which necessitates the replacement of these problem water mains. To facilitate the
use of the ANN models by the water utility, the information used for input parameters
must be readily accessible, because large quantities of historical data are needed. Because
there is a lack of available algorithms which describe the pipe break process, the
Artificial Neural Network modeling methodology is recommended.

1.1 Problem Statement


The purpose of this study was to investigate the feasibility of developing an Artificial 
Neural Network model to be used as a general screening tool. The expectation of this 
exercise was to confirm the potential utility of using the ANN methodology for modeling 
cast iron water pipe failures, accurately predicting the number of pipe breaks within a 
defined area for the purpose of determining the area’s pipe break density. This 
information would be used as a criterion for the Cast Iron Renewal Program. This 
program advocates proactive replacement of the pipes in areas with the highest break 
frequency. The Calder subdivision within the City of Edmonton was chosen as the study 
area. For this study, the scope of the modeling was limited to 150 mm cast iron pipe.
This was done largely due to availability of data.


Development of ANN models using this type of methodology requires the use of
historically collected data. These data must be meaningful and easily accessible to prove
model credibility and permit implementation of the model. Therefore, the effectiveness of
these models hinges on obtaining appropriate information. Having accomplished this
collection task, the finished models would intrinsically capture the cause-effect logic of
the pipe failure mechanisms. This would allow inferences to be made about the
predominant pipe failure modes.

The study undertaken is of particular interest to cities in cold weather climates. Given
that the majority of pipe failure causes are related to cold weather, the study applies
particularly to areas where winter seasons involve sub-zero temperatures.

1.2 Water Main Replacement Overview


Aqualta’s Engineering Department (formerly the City of Edmonton Water Branch, Water 
Network Engineering Section) is responsible for the maintenance of the water 
distribution infrastructure. Its water main replacement program is appropriately named 
the Cast Iron Renewal Program. The program’s operating philosophy is based on the 
concept that the area with the highest cast iron pipe break density represents the greatest 
threat to service disruption, and therefore is the greatest priority for water main 
replacement. Therefore, the cost effectiveness of a selected renewal project is directly 
proportional to its failure frequency (i.e. replacement of high failure frequency mains will 


remove more potential failures from the system per dollar spent). 


Studies show that a high percentage of the failures in the cast iron system tend to occur in
a relatively small percentage (by area) of the city's water distribution system. These high
frequency areas are normally designated for renewal based on a predetermined “critical
failure frequency”, in an effort to remove what is deemed to be the worst pipes from the
system. This “critical failure frequency” is defined as a point above which it is more
economical to replace the pipe, but below which it is more economical to repair it. In
the past several years this critical frequency has been set at 5.0 failures/km/year
(O'Farrell, 1995), taken over a five-year moving average. The idea behind the process is
to identify areas with consistently high historical break frequencies, and to replace these
potential problem areas proactively. Funds are therefore allocated to replace a certain
percentage of these areas per year.
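
The screening rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the utility's actual procedure: the function names, the dictionary layout, and the choice of a trailing (rather than centred) five-year window are assumptions made for the example. Only the 5.0 failures/km/year threshold and the five-year averaging period come from the text.

```python
from typing import Dict, List

CRITICAL_FREQUENCY = 5.0  # failures/km/year, threshold quoted in the text
WINDOW_YEARS = 5          # five-year moving average


def break_frequency(breaks: int, length_km: float) -> float:
    """Annual break frequency in failures/km/year."""
    if length_km <= 0:
        raise ValueError("length_km must be positive")
    return breaks / length_km


def moving_average_frequency(annual_breaks: List[int], length_km: float,
                             window: int = WINDOW_YEARS) -> List[float]:
    """Trailing moving average of break frequency (assumed window type).

    Returns one value for each year that has a full window of history.
    """
    freqs = [break_frequency(b, length_km) for b in annual_breaks]
    return [sum(freqs[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(freqs))]


def flag_for_renewal(areas: Dict[str, dict],
                     threshold: float = CRITICAL_FREQUENCY) -> List[str]:
    """Names of areas whose latest moving average meets the threshold."""
    flagged = []
    for name, record in areas.items():
        averages = moving_average_frequency(record["annual_breaks"],
                                            record["length_km"])
        if averages and averages[-1] >= threshold:
            flagged.append(name)
    return flagged
```

An area with fewer than five years of records yields no moving average and is never flagged, which is one possible way to treat short histories.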


The program has improved the service level by reducing the overall annual failure
frequency from slightly over 1.0 failures/km/year in the mid-1980s to 0.8
failures/km/year ten years later, a 20 percent improvement.

With the program progressing towards its goal of replacing all pipe in areas with break
densities equal to or greater than the critical five-year moving-average break density,
new methods of maintaining flexibility in decision-making are required. The ability to
accurately predict actual pipe break occurrences based on existing data is therefore
advantageous.

2.0 BACKGROUND INFORMATION


2.1 Pipe Failure Modes 


Statistics describing the performance of water mains are typically expressed in terms of a
frequency of breaks, where a break is an event leading to the disruption of water main
service and the frequency is stated per kilometer per year.

Goulter and Kazemi presented two papers describing a study of the break frequency for
Winnipeg, Manitoba [(Goulter and Kazemi, 1988) and (Goulter and Kazemi, 1989)].
This study is considered to be typical of many cities in North America and, therefore,
will be used as a reference point.


Studies in several Canadian and U.S. cities have shown that water main breaks can occur
in various modes of failure. These failure types include:

1. Circumferential failures (also referred to as circular or transverse);
2. Longitudinal split failures (includes diagonal);
3. Pinhole failures due to corrosion;
4. Pipe joint leaks (including fitting leaks); and
5. Clamp failures.

The most common types of failures, circumferential and longitudinal split failures, are
illustrated in Figure 1.


[Figure 1 image not reproduced. Legible labels: thermal expansion, frost, frictional, tensile stress, circular, surge of pressure, hoop stress.]

Figure 1. Circular and longitudinal split failure modes of water mains 
Adapted from (Rajani et al., 1996) 


Although modal failure statistics vary per city, an average of 70 percent of water main 
failures are circumferential failures, with the remaining 30 percent being shared by the 


other types of failures (Rajani et al., 1996). 


Circumferential failures are caused by longitudinal tensile stresses or by flexural
(bending) stress. Longitudinal split failures are the result of circumferential (hoop)
stress. Corrosion can also be the direct or indirect cause of pipe failures. These types of
failures may be the result of a “blowout” (whereby a surge pressure causes the
corrosion-thinned wall to fail), or may be caused by complete wall corrosion. Joint
failures can be the result of pipe jacking (heaving) or misaligned connections (Milligan,
1995). Clamp failures occur when clamps (used to repair previous pipe failures)
themselves fail. In some cases, a combination of the above failures can occur. Therefore,
these failures are usually the result of physical characteristics and environmental factors
interacting.


2.2 Failure Mechanisms


Theories explaining the failure mechanisms of cast iron water mains in cold climate
regions have advanced over the past two decades. Pipe failure modes have been
identified, and the mechanisms thought to cause such failures have been studied. The
mechanisms that dominate such discussions are frost heave, soil-pipeline
interaction, pipeline operating conditions and corrosion (both internal and external to the
pipe).

The above mechanisms have been identified as likely causes for the different modes of
failure of water mains. Circumferential pipe failures may be caused by excessive flexural
or axial stresses (Habibian, 1994). Flexural stresses are thought to be the result of frost
heave mechanisms (differential heave, beam-type loading). Axial stresses are caused by
soil-pipeline frictional resistance opposing pipe shrinkage (brought on by sudden,
extreme temperature drops). Longitudinal failures are thought to be caused by a high
temperature gradient across the pipe wall, generating high hoop stresses (Habibian,
1994). Corrosion failures are thought to contribute either directly or indirectly to the
majority of all water main failures (Kettler and Goulter, 1985), and both internal and
external pipe corrosion have been examined (Morris Jr., 1967). Pipe joint failures are also
thought to be caused by frost heaving, or by improper installation (Milligan, 1995).


2.2.1 Frost Heave


The frost heave mechanism has been a popular topic with regard to many types of
structures. Almost any underground structure requires the consideration of frost heave
effects that may displace portions of, or the entire, underground structure. Frost heave is
defined as the vertical expansion of soils caused by freezing of the soil and ice lens
formation.


Differential heave causes sections of pipe to experience non-uniform displacements, and 
this differential results in forceful flexural stresses (Figure 2). Uniform heaving may also 
prove to be a problem under certain circumstances where pipe joints are not subject to 
movement. Under this scenario, the pipe experiences stresses similar to a simple beam 
loading, in which case the pipe will experience bending stresses. Failure of pipe joints 
may be the result of the frost heave process (i.e. pipe jacking) or due to illicit connections 
(Milligan, 1995). This may be a function of the type of connection, and the type of fill
material used between joints.

[Figure 2 image not reproduced. Legible labels: frozen (stable) zone, uplift resistance from frozen soil; unfrozen (heaving) zone, heave exerted by frozen soil.]


Figure 2. Differential frost heave effects on pipelines. 
Adapted from (Nixon, 1994) 


The principles of frost heave mechanics are well known in theory. Conditions for frost 


heave require the following (Anderson et al., 1984): 


1. The presence of a frost susceptible soil; 

2. The presence of a sufficient water source, whether it is capillary or a ground water 
source (for lens formation) and; 

3. A ground temperature below zero degrees Celsius. Some argue that it is the change
in temperature that is more important (Bahmanyar and Edil, 1983); others have
argued that it is the difference between the prevailing temperature and the average
temperature (Bates et al., 1996).

With all of the above factors present, there is the potential for damage due to frost heave.

The propensity for heave of a soil under freezing conditions is affected by properties
such as grain size, rate of freezing, the availability of water, and by applied loads
(Konrad, 1987).


2.2.2 Soil-Pipeline Interaction 


Soil-pipeline interactions are also a possible cause of pipe failures. The resistance of the
soil-to-pipeline union is important because the shear strength of the interaction can affect
the degree of mobility of the pipeline and hence its ability to displace. In cold
temperatures, the bond between the soil and pipe determines how much the pipe is
restrained from shrinking axially. A strong soil-pipeline bond will not allow the pipe
to contract, and consequently the axial stress in the pipe will increase. It is also possible
that a strong bond between the iron pipe and soil will cause excessive soil-pipe interface
shear that may cause abrasion of the pipe coating. This abrasion may lead to premature
corrosion of the pipe exterior (Yen et al., 1981).


2.2.3 Pipe-Wall Temperature Gradients

For longitudinal failures, a suspected failure mechanism is the high temperature gradient
occurring across the pipe wall. If the temperature difference between the transported
water and the surrounding soil is significant, this temperature gradient can lead to
unusually high hoop stresses, subsequently leading to failure (possibly due to a water
pressure surge) (Habibian, 1994). Longitudinal failures may also occur in combination
with the weakening of the pipe wall due to corrosion, at the weakest portion of the main
wall. Another possible cause of longitudinal failure is a crushing load. This usually
occurs in the larger diameter pipes (O'Day, 1982).


2.2.4 Corrosion 


Corrosion failure is directly or indirectly associated with the reduction in pipe wall
thickness. These failures may occur as the result of internal or external corrosion.
Possible types of internal corrosion include: bacteriological (Lutey and Mason, 1994);
chemical corrosion occurring as internal pitting [(Morris Jr., 1967) and (Quraishi and
Al-Amry, 1992)]; or galvanic action (Morris Jr., 1967), although this type is less common.
External corrosion may be caused by: galvanic action (Morris Jr., 1967); electrolytic
oxidation due to low pH or stray currents (O'Day, 1982); or bacterial action, as sulfate
reduction (Morris Jr., 1967). Potential sources of stray direct current in Edmonton may
include electric railways (transit system and the Light Rail Transit system) and industrial
equipment. The corrosion-weakened wall may fail by pinhole failure, or failure may result
from combination with one of the above-mentioned failures (“blowout” or longitudinal
failures).


2.2.5 Other 


There are other factors that must be considered as causes for pipe failures. These causes
are unpredictable circumstances that must be accounted for, or eliminated, in the ANN
models. Special phenomena such as spatial and temporal clustering of pipe breaks must
also be investigated. All of these factors should be examined for ANN models, such that
the models will not confuse these types of events with frost heave, soil-pipeline
interaction, pipe wall temperature gradient, and corrosion failure mechanisms.

The phenomena of spatial and temporal clustering of pipe failures must be
studied. The potential of vehicle loading must also be examined. Pipe age and distance
from facilities may be considered, but some studies argue that pipe age is not a major
determinant of water main break rates (O'Day, 1982). Instantaneous pressure surges
causing water hammer and sudden pressure changes are examples of unpredictable
events (which may cause multiple failures). These types of failures must be
considered, and then either accounted for or eliminated before modeling can proceed.


Having analyzed the types of failure and summarized the failure mechanisms, the next
step is to select reasonable model input parameters that will allow the Artificial Neural
Network methodology to characterize the cause-and-effect relationship of pipe failure
mechanisms. Potential input parameters, based on the failure mechanisms, are outlined
in the following section, with justification for their significance.


2.3 Parameters That Cause/Influence Breaks 


In this section a discussion of potential causal and influencing factors affecting pipe
break failure mechanisms is presented. The literature cites only one example where the
Artificial Neural Network methodology was used for relating water distribution damage
to natural hazards. This paper attempted to relate cold temperature hazards (air
temperature, snow precipitation, and degree-days) to historical pipe damage (Bates et al.,
1996). Unfortunately, the paper neither discussed the reasons for including the
specific factors nor presented quantitative results demonstrating the accuracy
of the model. The following parameters are presented with reasons supporting their
inclusion.
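
Of the cold temperature hazards listed above, degree-days is the one that must be computed rather than read directly from meteorological records. The source does not state which definition was used, so the sketch below assumes freezing degree-days accumulated from daily mean air temperatures against a 0 °C base. The function name and the base temperature are assumptions for illustration.

```python
from typing import Iterable


def freezing_degree_days(daily_mean_temps_c: Iterable[float],
                         base_c: float = 0.0) -> float:
    """Sum of degrees below the base temperature over all days.

    Days at or above the base contribute nothing (assumed definition).
    """
    return sum(max(0.0, base_c - t) for t in daily_mean_temps_c)
```

For example, four days with mean temperatures of -10, -5, 2 and 0 °C accumulate 15 freezing degree-days under this definition.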


Pipe Diameter 


Review of pipe break literature indicates a strong correlation between the number of pipe
breaks and the diameter of the pipe. A study of pipe breaks conducted in Winnipeg,
Manitoba concluded that “the decreasing trend in pipe failure rate for cast iron pipe with
increasing diameter is directly attributable to the increasing wall thickness and joint
reliability with increase in pipe diameter. Larger wall thickness gives the pipe better
structural integrity and improved resistance to corrosion failures” (Kettler and Goulter,
1985). Many other studies have also shown that a larger proportion of failures have
occurred in the smaller diameter pipes [(Kitaura and Miyajima, 1996), (Bahmanyar and
Edil, 1983) and (Rajani et al., 1996)].


The literature suggests that the pipe size also affects the mode of failure (O'Day, 1982).
Smaller diameter mains (150 to 200 mm) often experience beam (flexural) failure
because of poor bedding conditions; however, crushing failures (often longitudinal
failures) are unlikely to occur due to the relative length-to-diameter ratio. Conversely,
larger mains (250 mm or greater) are likely to experience crushing failure, but are not
likely to experience beam failure (O'Day, 1982).


Age of Pipe 


The significance of pipe age as a determinant of pipe breakage is debatable. Some 
experts believe in a natural progression of occurrences of pipe breaks with age [(Goulter 
et al., 1990) and (Bates et al., 1996)]. Others have indicated that “studies show that age 


is not the major determinant of water main break rates” (O'Day, 1982). 


Pipe Joint Type 


Joint type is an issue since the type of joint will influence the susceptibility of the pipe to 
specific failures. A large part of this may be owing to the amount of flexibility and 
lateral constraint the joint provides, as well as the pipe joint’s actual strength and its 
ability to resist corrosion. For cast iron pipes in the previously mentioned Winnipeg 
study, joint failure is predominant with bolted and universal joints (Goulter and Kazemi, 
1989). Kitaura reported “joint separations for cast iron pipe occurred in the older lead 
and mechanical joints” (Kitaura and Miyajima, 1996). Morris Jr. speculates that certain
types of bolted or welded joint connections are more susceptible to corrosion (Morris Jr.,
1967).

Joint separations are leak failures where pipe joints become separated. The study of the
failure of a 108-inch pipe demonstrated that “total pipe separations occurred because of
large unrelieved thermal stresses and stress amplification caused by the eccentricity of the
welded bell-and-spigot joints” (Moncarz et al., 1987). A study by Milligan showed that
“different filler materials will have differing abilities to accommodate a greater deviation
off line before damage occurs” (Milligan, 1995).


Internal Pipe Water Temperature 


Some literature speculates that a high differential temperature between the internal and 
external pipe wall can produce high temperature gradients. Under such conditions the 
inner and outer fibers will be subjected to different temperature drops, resulting in 
differential strains and circumferential stresses. The increase in hoop stress increases the 
likelihood of longitudinal failures (Habibian, 1994). A co-author of a study performed in 
Madison, Wisconsin did not agree that this mechanism was a problem for seasonally cold 


regions (Bahmanyar and Edil, 1983). 


Operating Pressures 


For circumstances where water pipeline pressure surges result in blowout (longitudinal)
failures, a parameter that depicts changes in operating pressure (and therefore changes in
circumferential stresses) is necessary. Such events may occur as pressure surges (during
pump shut down or other normal pump operations) or during unforeseen events
(accidental valve closure, etc.). Such events may cause a water hammer effect, which is
a likely explanation for a limited number of pipe failures.


Environmental factors exist which may intensify or conversely buffer the pressure 
experienced by the pipe. A study by Burrows and Qiu (1995) indicated that the presence 
of air pockets in pipelines can exacerbate surge peaks. A typical example of this
possibility is provided in Figure 3. Conversely, work by Rajani et al. (1996) indicates
that at lower ground temperatures, the elastic moduli of soils can increase significantly
such that frozen soils will have a positive counteracting effect on the development of
hoop stresses.

[Figure 3 image not reproduced. Legible note: arrows show possible direction of air accumulation.]

Figure 3. Effects of air pockets on operating pressures.
Adapted from (Burrows and Qiu, 1995)

Soil Type


The significance of the type of soil cannot be overlooked, as it is one of the most
important factors, having effects on almost all of the above mechanisms. Its effects on
frost heave, the strength of soil-pipeline interaction, and external corrosion can be
important for many failure mechanisms.


Frost susceptibility is defined as the rate at which frost penetrates the ground. It is 
generally regarded as one of the most important factors in characterizing frost heave 
action. Frost susceptibility is ranked greatest to least for soil types in the following order: 
silt, clay, sand, and then gravel. However, methods of further quantifying and thoroughly 
characterizing soils in terms of frost susceptibility are not consistent. Use of frost heave 
rate (mm/day), total frost heave (mm), frost heave ratio (ratio of frost heave rate to total 
frost heave) and segregation potential (to depict frost susceptibility (Kujala, 1993)) have 
been suggested. However, these types of measures are often difficult to find, or do not 
translate accurately from laboratory to field values (Konrad and Nixon, 1994). Others 
disagree, indicating that “for a constant pressure (load on the soil) the rate of heaving is 
independent of the rate of freezing...[which is] completely valid only for relatively 
permeable soil” (Penner, 1972). Therefore characterization of frost susceptibility, and
hence frost heaving, is difficult using field measurements.


The type of soil the pipe is located in is also important with respect to differential
heaving and thaw settlement. If a pipe is located at the interface of two different soil
types, it has been shown that each soil will experience a different amount of frost
heaving, and will therefore influence the amount of strain experienced by the pipe
(Nixon, 1994). In the same manner, thaw settlement will lead to differential stress
distributions on the pipeline.


Several studies have demonstrated that freeze-thaw cycles have effects on the mass 
transfer and physical properties of the soil [(Kurilko et al., 1989); (Kujala and Laurinen, 
1989); (Rajani, 1992), and; (Pawluk, 1988)]. Results have also shown that textural
changes caused by freezing and thawing of clayey materials may alter the mass transfer
characteristics by two or three orders of magnitude and, as a result, affect frost heave
rates. However, no influence of cyclic freezing and thawing on sands is observed
(Kurilko et al., 1989).


Use of soil type to represent various soil properties is a difficult task due to the seasonal 
variance of several properties (deemed important to frost action, soil-pipeline interaction, 
and corrosion). In addition, many of the soil parameters influencing frost heave and 
corrosion will not be available due to difficulties in monitoring. Therefore, these
parameters may have to be inferred from soil type. Important parameters which are
assumed to be constant with soil type are: soil thermal conductivity (Konrad and
Morgenstern, 1980); Poisson’s ratio [(Shen and Ladanyi, 1991) and (Selig, 1988)]; and
hydraulic conductivity (Anderson et al., 1984).

Soil corrosivity is a soil characteristic that must be considered for external corrosion
predictions. Physical characteristics (particle size, friability, uniformity, organic content,
color) have been found to reflect corrosivity, based on observations and testing. Color has
also been linked to corrosivity. Soil uniformity is important because of the possible
development of localized corrosion cells. Corrosion cells may be caused by a difference in
potential between unlike soil types, with both soils being in contact with the pipe (Smith,
1968). If it can be assumed that for a particular soil classification the approximate
uniformity coefficient can be estimated, then the possibility of corrosion can be estimated.

Overburden Pressure


Overburden pressure is thought to be important due to its ability to help characterize frost
heaving and soil-pipeline resistance. It can be characterized by the depth of bury and soil
density. For simplicity, it will be assumed (for this study) that soil density is
generally characterized by soil type.
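
As a minimal sketch of how depth of bury and soil density combine, the total vertical overburden stress at pipe depth can be estimated as density times gravitational acceleration times depth. The function name and units are assumptions for illustration. The sketch ignores pore water pressure, surface loads, and layered soils, none of which are specified in the text.

```python
G = 9.81  # gravitational acceleration, m/s^2


def overburden_pressure_kpa(depth_m: float, soil_density_kg_m3: float) -> float:
    """Total vertical stress (kPa) from a uniform soil column above the pipe."""
    if depth_m < 0 or soil_density_kg_m3 <= 0:
        raise ValueError("depth must be non-negative and density positive")
    # kg/m^3 * m/s^2 * m = Pa; divide by 1000 for kPa
    return soil_density_kg_m3 * G * depth_m / 1000.0
```

If soil density is tied to soil type as assumed in this study, bury depth becomes the only additional input needed to represent overburden pressure.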


With respect to frost heave action, overburden pressure is another important factor for ice
lens formation (the others being: frost susceptible soil, freezing temperature and a source
of water). Literature indicates that the overburden pressure is important for the rate of
heaving [(Anderson et al., 1984); (Nixon, 1994); (Hu and Selvadurai, 1995), and; (Roy et
al., 1992)].

Bury depth is an important factor for other reasons. From the perspective of soil-pipeline
interaction, it has been demonstrated that the frictional soil resistance is affected by pipe
diameter and bury depth (Rajani et al., 1995). Also, from the perspective of mode of
failure, larger pipes are more susceptible than smaller pipes to crushing failure. This is
due to bury depth, or the external loadings the pipe is subjected to (e.g. roadways, large
structures) (O'Day, 1982).

Segregation Potential


Konrad thoroughly investigated segregation potential for characterizing frost
susceptibility [(Konrad and Morgenstern, 1981); (Konrad, 1987); (Konrad, 1994), and;
(Konrad and Nixon, 1994)]. This parameter is determined using laboratory
measurements and is a proportionality constant comprised of the measured hydraulic
conductivity (as pore-water velocity) and temperature gradient (Konrad, 1994). The
value obtained depends on the stress and thermal histories of the soil deposit. This
parameter may be especially useful as it combines two of the parameters that are
important for frost susceptibility characterization, namely hydraulic conductivity (which
also relates to water content) and temperature gradient (which relates to the freezing
rate).

Soil-Pipeline Reaction Modulus ($k_s$)

The axial soil-pipe reaction modulus, $k_s$, is a parameter which describes the interactive
resistive strength created by the soil and pipeline interface ($k_s$ is typically expressed as
MPa/m). Determination of this reaction modulus is done in one of several ways, either
from elastic properties or from empirical relationships for sand and clay (Rajani et al.,
1996). These relationships are illustrated in the following equations:

[Equation 1, expressing $k_s$ in terms of the elastic properties defined below, is not legible in the source.]    [1]

Where:
$D$ is the external diameter of the pipe, mm
$G_s$ is the soil shear modulus, MPa (N/mm²)
$\nu_s$ is the soil Poisson’s ratio

$$k_s = \frac{\alpha S_u}{u_y}$$    [2]

Where:
$\alpha$ is the adhesion coefficient
$S_u$ is the undrained shear strength of clays, MPa
$u_y$ is the displacement required to develop ultimate axial resistance, mm

$$k_s = \frac{\gamma' H \,(1 + K_0) \tan\delta}{2\, u_y}$$

Where:
$\gamma'$ is the submerged unit weight of soil, kN/m³
$H$ is the burial depth of water mains from surface to the centre line of the pipe, m
$K_0$ is the coefficient of active resistance at rest
$\delta$ is the frictional angle between the pipe material and surrounding backfill


This soil-pipe resistance parameter is not widely accepted, since investigation
using it is relatively new. Much of the recent research carried out has been
performed by Biggar and Sego (Rajani et al., 1996). While this measure is ideal because
it accurately measures soil-pipeline interaction, these values are not readily available.
However, it has been demonstrated that in general soil-pipe resistance increases with pipe
roughness (Rajani et al., 1996).
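
The empirical clay and sand relationships can be sketched numerically as follows. The forms used here are assumptions inferred from the variable definitions given above, not a transcription of the source equations: for clays, $k_s = \alpha S_u / u_y$; for sands, $k_s = \gamma' H (1 + K_0) \tan\delta / (2 u_y)$. Function names, argument names, and the unit conversions to MPa/m are likewise illustrative.

```python
import math


def ks_clay_mpa_per_m(alpha: float, s_u_mpa: float, u_y_mm: float) -> float:
    """Assumed clay relationship: adhesion * undrained strength / yield displacement."""
    if u_y_mm <= 0:
        raise ValueError("u_y_mm must be positive")
    return alpha * s_u_mpa / (u_y_mm / 1000.0)  # mm -> m gives MPa/m


def ks_sand_mpa_per_m(gamma_kn_m3: float, depth_m: float, k0: float,
                      delta_deg: float, u_y_mm: float) -> float:
    """Assumed sand relationship based on average normal stress and interface friction."""
    if u_y_mm <= 0:
        raise ValueError("u_y_mm must be positive")
    # Average interface shear resistance in kPa (kN/m^3 * m = kPa)
    shear_kpa = (0.5 * gamma_kn_m3 * depth_m * (1.0 + k0)
                 * math.tan(math.radians(delta_deg)))
    return (shear_kpa / 1000.0) / (u_y_mm / 1000.0)  # kPa -> MPa, mm -> m
```

Both functions return values in MPa/m, the unit in which the reaction modulus is typically expressed. Every input would still have to come from site-specific measurements, which is the availability problem noted above.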

Soil Elastic Modulus ($E_s$)


Literature has shown that the soil elastic modulus is a representative stiffness property for
soil-pipeline interaction [(Rajani et al., 1996) and (Selig, 1988)]. Therefore, the soil
elastic modulus will give a measure of the strength of the soil-pipeline interactions. This
is especially important if non-uniform soils are present. This parameter also may have a
counteracting effect on the development of hoop stresses during water pressure surges.


Soil pH (and Soil Resistivity) 


In order to characterize external corrosion, it is necessary to find parameters which 
indicate the corrosivity of the soil. Soil pH is a good general indicator of external 
corrosion since certain pH ranges allow for different corrosion mechanisms to occur. It 
has also been found that resistivity is a function of pH [(Morris Jr., 1967); (Booth et al., 
1967), and; (Jarvis and Hedges, 1994)]. For that reason, only one of the two may be 


required for characterization. 


Literature indicates a very poor correlation between soil type and soil resistivity (Dorn, 
1989). For this reason, soil resistivity cannot be incorporated into the soil type 
parameter. This being the case, soil type and either soil resistivity or soil pH may be
considered as potential input parameters, but inclusion of both soil resistivity and soil pH
is not necessary.


Soil pH can be divided into three important ranges: 0 to 4, 6.5 to 7.5 and 8.5 to 14
(Morris Jr., 1967). At a pH of 0 to 4, the soil acts as an electrolyte. In the neutral range, 
pH is optimum for sulfate reduction. At a pH of 8.5 to 14, soils are generally high in 
dissolved solids, and thereby yield a low resistivity [(Morris Jr., 1967) and (Jarvis and 


Hedges, 1994)]. 
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Interpretation of soil resistivity field measurements is extremely important. Only when 
reading the resistivity in soil at the specific pipe depth can the interpretation (of corrosion 
potential) be made accurately (Smith, 1968). As a result, the ground water content and 
soil temperature must also be ascertained. With cast iron pipe, corrosion resistance is 
enhanced if there are dry periods during the year. This seems to permit hardening or 
toughening of the corrosion scale or products, which then become impervious and serve 
as a better insulator (Smith, 1968). Also demonstrated by Smith was that resistivity will 
vary with the soil temperature. As the soil approaches freezing, resistivity will increase 
greatly, and thus a reliable reading may not be possible. If resistivity is to be measured, 
consideration must be given to a lack of consistent readings between field and laboratory 
measurements. Therefore it is necessary to assign ranges of resistivity, rather than 


specific numbers (Smith, 1968). Common ranges for soil resistivity are given in Table 1. 


Table 1. Common ranges of resistivity for soils. 
Adapted from (Dorn, 1989) 


CORROSION CLASS          RESISTIVITY (METRE-OHMS)
Severely Corrosive       0 to 5
Very Corrosive           5 to 10
Moderately Corrosive     30 to 100
Slightly Corrosive       Above 100
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Soil Aeration (Redox Potential) 


This parameter is useful in characterizing the potential for bacterial corrosion. It is an
established fact that sulfate-reducing bacteria can live only under anaerobic conditions; a
redox potential greater than 100 mV indicates sufficient soil aeration so as not to support
sulfate reducers (Morris Jr., 1967). However, Jarvis and Hedges (1994) contradict the
above, stating that values less than 400 to 430 mV indicate a suitable environment for
sulfate-reducing bacteria. Overall, while there may not be general agreement on the
range for which sulfate corrosion is favorable, there is agreement that this measurement
is useful for characterizing bacterial corrosion potential.


Soil Water Content 


Use of the soil water content parameter is important from several aspects. As mentioned 
earlier, the rate of frost heave is controlled by the availability of free water (McGaw, 


1972). It is also important for external corrosion. 


From the perspective of frost heave, it has been stated that the availability of a water
source is one of the necessary elements required for ice lens growth. In the absence of a
nearby ground water table, focus then shifts to the availability of water present in the soil
itself, i.e., soil water content. In reality, the water content may be a possible surrogate
measure for water table depth, as water may enter the soil above by capillary suction.
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From the perspective of external corrosion, soil corrosion aggressiveness has been related 
to moisture content. Soils with a moisture content above 20 percent (wet basis) are 
thought to be particularly corrosive (Jarvis and Hedges, 1994). Another study cited a 
correlation between soil aggressiveness and an optimum water content at a minimum 
resistivity. Therefore, these results substantiate moisture content as a measure of soil 


aggressiveness (Booth et al., 1967). 


Anderson and Tice (1972) provided a study that demonstrated that water content may be 
related to soil temperature and specific surface area. In this study, an empirical equation 
was devised relating the unfrozen water content of partially frozen soils to the soil 
temperature and specific surface area. Results comparing computed water contents and 
experimentally obtained values showed good agreement, particularly for temperatures 
below -5 °C. Another study demonstrated how water content is also affected by the
salinity of the soil (Jones, 1995). These observations again demonstrate the importance 


of soil type as a general indicative parameter. 


Cluster Indices 


As mentioned earlier, several studies have been performed that focus on the analysis of
causes of cast iron water main failures. There has been particular focus on the temporal
and spatial correlation of break events. The Winnipeg study performed by Goulter
indicates that pipe failures often occur in clusters, located in relatively close proximity to
other breaks [(Goulter et al., 1990); (Goulter and Kazemi, 1988), and; (Goulter and 
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Kazemi, 1989)]. The same study shows that the likelihood of a break occurring 
decreases with time from the break of another pipe in the area [(Goulter and Kazemi, 
1988) and (Goulter and Kazemi, 1989)]. Further investigation of these phenomena is
needed to determine their significance.


Air Temperature 


Air temperature is an important parameter since it characterizes the change in climate, 
and on a smaller scale, the change in seasons. It is integral for characterizing the 
potential for frost heaving and soil-pipeline interaction stress generation. It also affects 


the measurement and stability of soil parameters. 


Deterministic models show there is a relationship between air and ground temperatures 
(see Figure 4). However, heat transfer from ground to air requires time due to the 
reduced thermal conductivity of the soil (compared to air). Since time is required for the 
ground temperature to equilibrate to the ground surface temperature (i.e. temperature at 
the ground-air interface), ground temperature can be considered a function of depth. In 


this manner, air temperature indirectly affects pipe failure mechanisms in two ways: 


1. temperature-induced contraction, and;
2. frost heave mechanics (O'Day, 1982).
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Permafrost Non-Permafrost 
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Mean annual soil surface temperature 


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Figure 4. Whiplash curve for ground-to-air heat transfer. 
Adapted from (Phukan, 1985) 


As illustrated in the above deterministic model, air temperature is not indicative of the
ground temperature unless the ground temperature is considered as a function of time and
depth of interest. In using the Artificial Neural Network modeling approach, presenting
the model with time-series data may allow the model to characterize the air-to-ground
temperature transition effects without using the actual ground temperature (at pipe
depth). This of course assumes that the conductivity of different soils is accounted for.
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It is speculated that the daily drop in air temperature will be indicative of the rate of frost 
penetration [(McGaw, 1972); (Miller, 1972); (Anderson and Tice, 1972); (Penner, 1972);
(Anderson et al., 1984); (Shen and Ladanyi, 1991); (Hu and Selvadurai, 1995); 
(Bahmanyar and Edil, 1983), and; (Roy et al., 1992)]. Some of these aforementioned 
authors have also reported that there is a correlation between a drop in air temperature 
and an increase in pipe breaks. One conflicting opinion, expressing that air temperature 


drops were not responsible for pipe breaks, was found (Habibian, 1994). 


Use of air temperature (or differences in air temperature) may also be considered as a 
surrogate measure for freezing and thawing indices (Boyd, 1973). Use of these indices in 
prediction models, and use of air temperatures in time series may give a more realistic 


representation of climatic changes. 
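
The air temperature transformations discussed above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used in this study: the function names, the 0 °C base temperature, and the simplified degree-day definition of the freezing index are all assumptions made for the example.

```python
from typing import List, Sequence


def daily_drops(temps_c: Sequence[float]) -> List[float]:
    """Day-to-day drop in mean air temperature (positive = cooling, 0 = no drop)."""
    return [max(temps_c[i - 1] - temps_c[i], 0.0) for i in range(1, len(temps_c))]


def freezing_index(temps_c: Sequence[float], base_c: float = 0.0) -> float:
    """Cumulative degree-days below the base temperature (simplified, assumed definition)."""
    return sum(base_c - t for t in temps_c if t < base_c)


def lagged_patterns(series: Sequence[float], n_lags: int) -> List[List[float]]:
    """Time-series presentation: each pattern is the current value preceded by n_lags earlier values."""
    return [list(series[i - n_lags:i + 1]) for i in range(n_lags, len(series))]


# Hypothetical daily mean air temperatures in degrees Celsius.
temps = [2.0, -1.0, -5.0, -3.0]
drops = daily_drops(temps)
index = freezing_index(temps)
patterns = lagged_patterns(temps, 2)
```

Presenting lagged patterns rather than single daily values is one way the model could be given the time history it would need to infer the delayed ground response.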


Consideration of air temperature is also important for its effect on many soil properties. 
Constant monitoring of changes of soil water content, hydraulic conductivity, undrained 
strength, elastic modulus (Shen and Ladanyi, 1991), resistivity, and depth of 
consolidation is not feasible. It is anticipated that use of the air temperature parameter in 
conjunction with precipitation and soil type will allow the ANN model to account for 


these changes in parameters. 
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Precipitation (snow and/or rain) 


Snow is indicative of the insulating effect on ground temperature, as the snow will allow 
for the entrapment of heat into the ground. Rain precipitation coupled with the soil type 
may be indicative of moisture content or hydraulic conductivity if these parameters are 
not measured regularly. Some literature indicates that corrosion resistance is enhanced 
during dry periods of the year (Smith, 1968). Rain precipitation may also indicate an 
abundance of water supply [(Penner, 1972); (Anderson et al., 1984); and; (Roy et al., 
1992)]. Therefore, inclusion of this parameter may be necessary to help characterize 


climatic changes as well as to infer adjustments to soil parameters. 


Rain precipitation may be a significant factor if water main breaks can be related to the
swelling and consequently the instability of saturated clay soils during heavy rainfall
events. This is based on the fact that many clay soils swell and shrink to a high degree
according to the soil moisture content, and exhibit a high plasticity and cohesion (Clark,
1971).


Summary of Parameters 


Based on the literature reviewed, a conceptual list of causal and influencing parameters is
presented in Table 2. These are input parameters which should ideally be included in the
ANN model. Analysis of data availability or reliability will determine the feasibility of
their inclusion.
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Table 2. Summary of Influencing and Causal Factors. 


POSSIBLE INFLUENCES ON PIPE BREAKS

Pipe strength characteristic
Possible corrosion factor
Strength and rigidity of connections
Pipe-wall temperature differential stresses
Water hammer; high water pressures
Soil type: Frost action; Soil-pipeline interaction; Corrosion related
Frost heave characterization; Soil-pipeline strength
Frost susceptibility
Soil-pipeline resistance
Soil-pipeline interaction
Corrosion parameter
Corrosion potential
Frost action mechanics; corrosion
Pipe break phenomena
Air temperature: Frost action; Soil-pipeline interactions; Soil parameter characterization
Precipitation: Soils stability (especially clays); ground temperature insulating effects
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2.4 Artificial Neural Networks Overview 


With the influencing and causal factors now outlined, it has been demonstrated that a
wide array of factors has potentially significant effects on each type of pipe
failure. There are established relationships and interactions between many of these
factors. Although many influencing factors have been identified,
there is uncertainty as to their roles in causing cast iron pipe breaks. Expert knowledge
as to causes of failures is reasonably advanced, yet past attempts at modeling these
failures have not yielded satisfactory conclusions. An alternative method of modeling
using Artificial Neural Networks, which has the potential to model the pipe
breaks, is presented.


2.4.1. Background Information and Application 


The Artificial Neural Network modeling technique, though not conceptually new, has 
only recently been explored in many fields of civil and environmental engineering 
because of its requirements for intensive computing capabilities. With the advancements 
in computing technology, this modeling technique is gaining popularity for its abilities to 
deal with problems having non-linear solutions, and particularly with its ability to 


forecast events. 


This modeling approach is generally regarded as one of the best at extracting concepts 


from historical data, and has a strong ability to learn, and thus has the ability to forecast 
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future events within its study domain. Since the ANN modeling technique utilizes an 
organizational structure that has numerous interconnections, it does not rely upon 
deterministic mathematical equations. As a result, its structure is configured such that it 


can handle complex, non-linear problems. 


Appropriate application of Artificial Neural Networks requires that the following 


characteristics of the problem application exist: 


1. The algorithm required to solve the problem is unknown or expensive to discover; 
2. Heuristics or rules required to solve the problem are unknown or difficult to establish 
and; 


3. The application is data intensive and a variety of data are available (Zhang, 1996). 


As previously mentioned, earlier attempts at developing an analytical procedure using 
conventional methods for predicting water main failure have not proven successful. The 
possibility of modeling pipe break failures using ANN exists provided that adequate, 
easily accessible information can be collected. Ideally, a comprehensive model will
involve data inputs from the causal and influencing parameters described above.
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2.4.2 ANN Model Development Process 


The general process of developing an Artificial Neural Network is composed of four
interdependent stages:


1. Source Data Analysis; 
2. System Priming; 
3. System Fine-Tuning; and 


4. Model Evaluation. 


Source data analysis involves identifying and preparing the potential input parameters for 
use in the Artificial Neural Network models. Description of the considerations for this 
model study is to be detailed in Section 3.1. System priming involves determining which 
of the potential input parameters will be most appropriate for the problem study. The 
system fine-tuning involves adjusting the Artificial Neural Network model structures to 
optimize learning of the input data presented. The system priming and system fine- 
tuning stages are often done concurrently and are described in Section 3.2. In the final 


stage, Model Evaluation, the models are evaluated both qualitatively and quantitatively. 


These stages will be described in more detail in Section 3. For the ANN modeling
portion, the program NeuroShell 2, developed by Ward Systems Group, Incorporated, was
used (Ward Systems Group, 1993).
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2.4.3. ANN Model Structure 
General Concept 


Artificial Neural Networks is one classification of Artificial Intelligence (AI). It is a 
“black box” methodology that typically “learns” by comparing an input pattern or 
sequences of input patterns, to the outputs. Relationships between model input and 
model output are formed using an error correction algorithm, which attempts to minimize 
the error between model prediction and the actual events. Error corrections are 
distributed amongst the hidden neurons within the model. When presented with a wide 
range of appropriate input patterns and a suitable model structure, these models are able 


to intrinsically learn the underlying logic features and importance of these patterns. 


The ANN technique is a form of artificial intelligence that simulates what we think we know
about how the brain works (Schmuller, 1990). An Artificial Neural Network is
composed of a set of simple processing units called “neurons” (as illustrated in Figure 5),
each capable of only a few computations such as summation and threshold logic (Garrett
et al., 1992). These neurons are arranged in layers, with the neurons from each layer
being interconnected to one another. This configuration lends itself to self-organization
and learning.



Figure 5. Typical ANN backpropagation structure. 
Modified from (Sacluti et al., 1998) 


The Artificial Neural Network learning process involves the entry of significant input 
parameters into the model, with the output parameters known (for “supervised” learning). 
Conversely, “unsupervised” learning is where the output parameter is unknown. Input 
data (often modified and/or placed in time series) enters each individual neuron with an 
initial weight value. Depending on the scaling function (or activation function in 
subsequent layers), a new significance value is assigned to the output signal. These 
values are conveyed to other interconnected neurons in subsequent layers until an output 
parameter is determined. The actual learning occurs when feedback iterations are 


performed, and the model begins to organize itself according to the data it is presented. 
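
The signal flow described above, from weighted inputs through an activation function to the next layer, can be illustrated with a minimal Python sketch. It assumes a fully connected network with logistic activations and a bias term per neuron; the bias term and all function names are assumptions for illustration and do not describe the internals of NeuroShell 2.

```python
import math
from typing import List, Sequence, Tuple

Layer = Tuple[Sequence[Sequence[float]], Sequence[float]]  # (weights, biases)


def logistic(x: float) -> float:
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_output(inputs: Sequence[float],
                 weights: Sequence[Sequence[float]],
                 biases: Sequence[float]) -> List[float]:
    """Each neuron sums its weighted inputs, adds a bias, and applies the activation."""
    return [
        logistic(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs: Sequence[float], layers: Sequence[Layer]) -> List[float]:
    """Pass the signal through each layer in turn until the output layer is reached."""
    signal = list(inputs)
    for weights, biases in layers:
        signal = layer_output(signal, weights, biases)
    return signal


# Hypothetical network: two inputs, two hidden neurons, one output neuron.
hidden: Layer = ([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
output: Layer = ([[0.0, 0.0]], [0.0])
prediction = forward([1.0, 2.0], [hidden, output])
```

With all weights set to zero, every neuron outputs 0.5 regardless of its inputs, which is the uninformed starting point that the feedback iterations then adjust.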


Optimal and organized model development involves selecting appropriate input 
parameters and suitable ANN model features. These structures may be internal or 


external to the neuron units. Those structures internal to the neuron are the scaling and 
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activation functions. The most significant model feature external to the neuron is the 


type of learning architecture. 


Scaling and Activation Functions 


Data initially fed into the model must be scaled from their numeric range into a relative
range that the network can deal with more efficiently (Ward Systems Group, 1993).
The model structure performing this is called the “scaling function”. Scaling may be
linear (Equation 6), but may also use the non-linear scaling functions logistic (Equation
4) or tanh (Equation 5). These non-linear scaling functions will tend to group data at both
the higher and lower limits of the original data range. NeuroShell 2 sets the default
scaling function to the linear function.
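
As a rough illustration of the difference between linear and non-linear scaling, the following Python sketch maps raw values into a bounded range. The exact scaling formulas used by NeuroShell 2 are not given here, so the centring value (`mean`) and the `spread` divisor are assumptions chosen only to show the behaviour.

```python
import math


def linear_scale(value: float, lo: float, hi: float) -> float:
    """Map the raw range [lo, hi] linearly onto [0, 1]."""
    return (value - lo) / (hi - lo)


def logistic_scale(value: float, mean: float, spread: float) -> float:
    """Squash values into (0, 1); values far from the mean crowd toward 0 or 1."""
    return 1.0 / (1.0 + math.exp(-(value - mean) / spread))


def tanh_scale(value: float, mean: float, spread: float) -> float:
    """Squash values into (-1, 1); values far from the mean crowd toward -1 or 1."""
    return math.tanh((value - mean) / spread)


# Hypothetical air temperatures (degrees Celsius) scaled three ways.
raw = [-30.0, -10.0, 0.0, 10.0, 30.0]
linear = [linear_scale(t, -30.0, 30.0) for t in raw]
squashed = [logistic_scale(t, 0.0, 5.0) for t in raw]
```

In this sketch the linear version keeps the spacing between values, while the logistic version pushes the extreme temperatures close to 0 and 1, which is the grouping effect at the range limits noted above.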


Activation functions are structures in layers subsequent to the input layer (i.e. hidden and 
output layers). They dictate how the individual neurons pass neuron output weight 
values from the summed neuron input weight values of the previous layer. The 
activation function maps the inputted sum into the output weight value, which is then 


passed onto the succeeding layer. 


NeuroShell 2 provides a number of activation functions that allow for flexible application
to problems. The more significant activation functions are: logistic (Figure 6), linear
(Figure 8), tanh (Figure 7), Gaussian (Figure 9), and Gaussian complement (Figure 10).
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The default setting for models is the logistic function. This function has been found to 
work best for most neural network applications (Ward Systems Group, 1993); however,
there are always exceptions to this.


The logistic function is mathematically described as: 


f(x) = 1 / (1 + exp(-x))    [4]


This is graphically represented by: 




Figure 6. Logistic activation function. 


Adapted from (Ward Systems Group, 1993) 




The hyperbolic tangent (tanh) function is:
f(x) = tanh(x)    [5]


and is illustrated by: 


Figure 7. Tanh activation function. 
Adapted from (Ward Systems Group, 1993) 
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The linear activation function is given by: 
f(x) = x    [6]


and is graphically represented by: 


Figure 8. Linear activation function 
Adapted from (Ward Systems Group, 1993) 
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The Gaussian activation function is mathematically described by: 


Gaussian = exp(-x²)    [7]


Figure 9. Gaussian activation function
Adapted from (Ward Systems Group, 1993)
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The Gaussian Complement activation function is described by: 


Gaussian complement = 1 - exp(-x²)    [8]


Figure 10. Gaussian complement activation function 
Adapted from (Ward Systems Group, 1993) 
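
For reference, the five activation functions listed in this section can be written compactly in Python. This is a sketch based on the standard mathematical forms of these functions; the dictionary name and keys are illustrative only and are not taken from NeuroShell 2.

```python
import math

# Standard forms of the five activation functions discussed in this section.
ACTIVATIONS = {
    "logistic": lambda x: 1.0 / (1.0 + math.exp(-x)),
    "tanh": math.tanh,
    "linear": lambda x: x,
    "gaussian": lambda x: math.exp(-x * x),
    "gaussian_complement": lambda x: 1.0 - math.exp(-x * x),
}

# Evaluate every function at x = 0 to compare their midpoints.
at_zero = {name: fn(0.0) for name, fn in ACTIVATIONS.items()}
```

At x = 0 the Gaussian function is at its peak while the Gaussian complement is at its minimum, which suggests why the two respond to opposite features of the summed input.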


Learning Architectures 


As stated earlier, the program used for model development, NeuroShell 2, includes
several different types of Artificial Neural Network supervised learning architectures.
These architectures include the standard Backpropagation networks (Figure 11), as well as
Probabilistic Neural Networks (PNN) and General Regression Neural Networks
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(GRNN). Within each of these learning architectures, especially the Backpropagation 
networks, unique features of the network structure give way to distinct model types such 
as Jump Connections networks (Figure 12), Recurrent networks (Figure 13), and Ward 
networks (Figure 14). This subset of backpropagation networks is designed to provide 
flexibility in design of the network, and vary the method of data presentation (Ward 
Systems Group, 1993). This flexibility may allow the network to capture specific 
features of the data set or problem study, not as easily captured with a standard 


connection backpropagation network. 




Figure 11. Standard connection network structure. 


Modified from (Ward Systems Group, 1993) 
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Figure 12. Jump connection network structures. 
Modified from (Ward Systems Group, 1993) 
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Figure 13. Recurrent network structures. 


Modified from (Ward Systems Group, 1993) 
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Figure 14. Ward network structures. 
Modified from (Ward Systems Group, 1993) 


The Backpropagation architecture offers the greatest amount of flexibility in NeuroShell 
2 since there are a number of network options. Standard Connection networks are the 
simplest form of ANN model, with each layer of neurons connected to the adjacent layer. 
Jump Connection networks allow for more involved linking, such that every layer is 
connected to each other layer, not solely to the adjacent layer. The Recurrent Networks 


have an additional input layer which stores the contents of the previous pattern that was 
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trained, allowing the network to see previous knowledge it had about previous inputs 
(Ward Systems Group, 1993). A final backpropagation type model offered in NeuroShell 
2 is the Ward Network, developed by the supplier of the program. This network employs 
the use of varying activation functions, designed to identify different features of the data 
set. Thus, the output layer receives different “views of the data” (Ward Systems Group, 


1993). 


The Probabilistic Neural Networks and General Regression Neural Networks are both 
three layer network architectures in which the input patterns are presented to the input 
layer, and each individual input pattern is retained by one or more hidden layer neurons. 
The output from the individual pattern is either categorized according to a probability 
density function (for PNN) or presents a continuous value output (for GRNN) based upon 


a comparison to all retained patterns. 
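
The idea of predicting by comparison to all retained patterns can be sketched as a kernel-weighted average, which is the commonly published formulation of a General Regression Neural Network. The Gaussian kernel, the smoothing parameter `sigma`, and the function name are assumptions for illustration; the implementation in NeuroShell 2 may differ.

```python
import math
from typing import Sequence


def grnn_predict(x: Sequence[float],
                 train_x: Sequence[Sequence[float]],
                 train_y: Sequence[float],
                 sigma: float = 1.0) -> float:
    """Weighted average of stored outputs, weighted by closeness to each stored pattern."""
    weights = []
    for pattern in train_x:
        dist_sq = sum((a - b) ** 2 for a, b in zip(x, pattern))
        weights.append(math.exp(-dist_sq / (2.0 * sigma ** 2)))
    total = sum(weights)
    # Note: total can underflow to zero if x is very far from every stored pattern.
    return sum(w * y for w, y in zip(weights, train_y)) / total


# Hypothetical one-input example with two retained patterns.
stored_inputs = [[0.0], [1.0]]
stored_outputs = [0.0, 2.0]
midpoint = grnn_predict([0.5], stored_inputs, stored_outputs)
```

A query halfway between the two stored patterns weights them equally, so the prediction is the average of their outputs; a smaller `sigma` makes the prediction follow the nearest stored pattern more closely.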


The initial layer of all networks comprises the model’s input parameters, with each
input parameter assigned to an input neuron. These neurons are collectively called the
input layer. The final layer is the output layer, and is composed of a single
output (sometimes multiple outputs) that is (or are) being modeled. The layers in
between are described as “hidden layers” because,
while each neuron serves as a unit-process decision (each neuron makes a decision based 
on the input it receives from other neurons) these decisions are intrinsic to the model, and 


these neurons’ outputs are not readily seen. The combination of the hidden neuron 
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outputs and interconnections between neuron layers culminate in the decision-making 


process of the model. 
2.4.4 ANN Model Learning Mechanics 


The ANN models employed for this study used an error backpropagation algorithm for 
learning. For this type of learning, input patterns fed into the model produce an initial 
model output. The resulting output is then compared to the actual outcome 
(corresponding to the input pattern) to determine the model’s prediction error. This error 
is then fed back into the hidden layer neuron interconnection weights. The error is batch 
corrected, and errors are distributed throughout the neuron connections. These modified 
weights attached to each incoming neuron signal, produce new output weights, which are 
governed by the type of activation functions utilized. These new output values are then 
transmitted to the neurons of the following interconnected layers, until a new output 
signal is derived at the output neuron. This error correction cycle continues until either
the prediction error reaches a minimum value (determined by a user-specified maximum
iteration period) or the prediction error is below a user-specified error limit. Due to the
complexity of the neuron interconnections, connection weights will continue to develop
until a set of stable connection weights is found.
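
The learning cycle described above can be sketched as a small batch backpropagation loop in Python. This is a minimal sketch under stated assumptions, not the algorithm as implemented in NeuroShell 2: it assumes one hidden layer, logistic activations, a bias input on each layer, a mean squared error measure, and plain gradient descent; the learning rate, the stopping limits, and all names are illustrative.

```python
import math
import random
from typing import List, Sequence


def logistic(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def train_backprop(patterns: Sequence[Sequence[float]],
                   targets: Sequence[float],
                   n_hidden: int = 3,
                   rate: float = 0.5,
                   max_epochs: int = 2000,
                   error_limit: float = 0.01,
                   seed: int = 0) -> List[float]:
    """Train a one-hidden-layer network; return the mean squared error per epoch."""
    rng = random.Random(seed)
    n_in = len(patterns[0])
    # The extra weight in each row multiplies a constant bias input of 1.0 (assumption).
    w_hid = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    history: List[float] = []
    for _ in range(max_epochs):
        g_hid = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
        g_out = [0.0] * (n_hidden + 1)
        sq_err = 0.0
        for x, t in zip(patterns, targets):
            xb = list(x) + [1.0]
            h = [logistic(sum(w * v for w, v in zip(row, xb))) for row in w_hid]
            hb = h + [1.0]
            y = logistic(sum(w * v for w, v in zip(w_out, hb)))
            err = y - t                      # prediction error for this pattern
            sq_err += err * err
            d_out = err * y * (1.0 - y)      # error signal at the output neuron
            for j in range(n_hidden + 1):
                g_out[j] += d_out * hb[j]
            for j in range(n_hidden):
                # Error distributed back to hidden neuron j through its output weight.
                d_hid = d_out * w_out[j] * h[j] * (1.0 - h[j])
                for i in range(n_in + 1):
                    g_hid[j][i] += d_hid * xb[i]
        mse = sq_err / len(patterns)
        history.append(mse)
        if mse < error_limit:                # user-specified error limit reached
            break
        # Batch correction: weights change once per pass through all patterns.
        for j in range(n_hidden + 1):
            w_out[j] -= rate * g_out[j]
        for j in range(n_hidden):
            for i in range(n_in + 1):
                w_hid[j][i] -= rate * g_hid[j][i]
    return history


# Hypothetical toy problem (logical OR) used only to exercise the loop.
errors = train_backprop([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]],
                        [0.0, 1.0, 1.0, 1.0])
```

The two stopping conditions in the loop correspond to the maximum iteration period and the error limit mentioned above.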
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2.4.55 ANN Data Set Presentation 


As stated earlier, ANN models learn by the presentation of input and actual output data 
patterns. When developing the model, it is important to realize that the purpose of 
presenting the model with various data patterns is to allow the model to intrinsically 
extract logic concepts from the data set. For this to occur, information presented must be 
representative of the full range of events, and a wide range of different patterns. The 


method by which the model is presented with the data is also important. 


Three data sets are used: training, testing and production sets. The training data set is the 
set of patterns from which input and output patterns are initially entered into the model to 
train the ANN model. The developing model cyclically compares itself against a test 
data set, from which the model calculates its progress (where the goal is to minimize the 
error between the actual output, and the output of the model). The development and 
comparison cycle continues until there is a minimum specified error in prediction, or the 
model is unable to progress further. The model can then be tested against the production 
set for model verification. The production data set is a set of data points, which the 
trained model has never seen before. This data set is used for quantitative measurement 
of model learning ability and feature extraction capability. The pattern file contains all of 


the data pattern sets, including training, test and production sets. 
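
The division of a pattern file into training, test and production sets can be sketched as follows. The 60/20/20 proportions, the random shuffling, and the function name are assumptions for illustration only; they are not the proportions or extraction method used in this study.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_patterns(patterns: Sequence[T],
                   test_frac: float = 0.2,
                   production_frac: float = 0.2,
                   seed: int = 0) -> Tuple[List[T], List[T], List[T]]:
    """Shuffle the pattern file and divide it into training, test and production sets."""
    rng = random.Random(seed)
    shuffled = list(patterns)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    n_prod = int(len(shuffled) * production_frac)
    production = shuffled[:n_prod]            # never shown to the model during training
    test = shuffled[n_prod:n_prod + n_test]   # used to monitor progress during training
    training = shuffled[n_prod + n_test:]     # used to adjust the connection weights
    return training, test, production


# Hypothetical pattern file of ten numbered patterns.
training, test, production = split_patterns(list(range(10)))
```

Because the three sets are disjoint, the production set remains unseen by the trained model, which is what makes it suitable for verification.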
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3.0 METHODOLOGY 


When assessing the feasibility of Artificial Neural Network methodology for research 
into pipe breaks, the modeling process must involve demonstration of knowledge and a 
logical experimental thought process. This section demonstrates these abilities through 


the collection, evaluation and use of data, and logical use of Artificial Neural Networks. 


3.1 Data Collection 


The effectiveness of an ANN model depends on the availability and reliability of the 
input data. Finding data that represents or corresponds to the possible factors reviewed 
was important for representing the physical cause-effect relationships. The reliability of 
the data is measured by the amount of “noise” inherent in the data. Noise consists of data
patterns that contain inaccuracies and discrepancies, which do not allow the model to
make proper associations between input and output patterns. Use of data with little
apparent noise would result in a more accurate and precise model. As a result, precision 


in monitoring and collection of data was analyzed. 


Data collection involved evaluating all available data based on accessibility, relative ease 
of obtaining long-term relevant data, and the prospect of future availability of the same 
type of data for future models. This data must have characteristics that are significant for 
model convergence. If all the proposed model input parameters are used for the model, 


the run times for model training will be exceedingly long, and hence would be an 
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inefficient use of time. Also, if insignificant (or inappropriate) data is not eliminated 
initially, the redundant input parameters will be treated as “noise” by the ANN model, 


and as such may decrease the likelihood of model convergence. 


3.1.1 Parameter Collection and Analysis 


Artificial Neural Network modeling requires not only representative data, but meaningful 
forms of the data. For this reason, manipulation of the raw data may be required to make 
input patterns more meaningful to the model. In cases where multiple failure 
mechanisms depend on a specific parameter, but for different physical representations, it 
becomes necessary to represent the same raw data in different forms to reflect the 
significance of these different representations. For this purpose, input patterns are 
changed to be more indicative of the underlying failure mechanism. A review of the 
plausible model input parameters and justification of their transformations are presented 


below. 


Because of the nature of pipe failures, literature indicates that to fully predict these
occurrences, it is necessary to have a wide range of representative data parameters. Due 
to limitations in the collection of input data, it became necessary to restrict the scope of 
the output being predicted. As mentioned earlier, the most important factors include pipe 
characteristics, pipeline operating characteristics, soils characteristics, soils properties, 


environmental properties and cluster indices. 
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Investigations into available raw data indicated that a large number of the suggested input 
parameters would not be available. Data for the various ideal input parameters that met 


the requirements (reliable and available in reasonable abundance) was difficult to collect. 


Source data relating to individual pipe characteristics was found in water main leak or 
break repair reports. This report is standard documentation to be included with any water 
main repair performed. The information contained on this report is detailed, since it 


includes: 


• pipe break location (street, avenue and distance from property lines);
• pipe depth at bury and apparent frost depth;
• report and repair dates;
• pipe material;
• size of pipe (pipe diameter);
• nature of break (i.e., longitudinal, transverse, etc.);
• apparent cause of break;
• condition of pipe and/or coating or wrapping and;
• site map of break.


This data was obtained from hardcopy documents stored at the water yard, as this level 


of detail is not typically entered in computer spreadsheet form. 
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Air temperature and precipitation (rain and snow) raw data was available from the closest 
weather monitoring stations. In this study, meteorological data from the municipal 
airport was used, since this is a city-central monitoring station, and located fairly close to 
the study area. Operating pressures were available, although they were only average 
daily pressures, and pump shutoff and water hammer events were not typically logged. 
Water temperature values were available for the supplying reservoirs and water treatment 
plants. Soils characteristics (soil type, soil parameters, corrosion-related parameters) 


were not available in sufficient detail or with enough frequency in collection to be useful. 


The study was limited to modeling a city subdivision within the City of Edmonton’s 
water distribution system (Figure 15). The selected area was the Calder subdivision 
(refer to map on Figure 16). This study area was a high-density break area for which data 
from 1972 to 1994 were collected and input into a computer spreadsheet. It was 
unknown why this particular area was experiencing a high break density when 
surrounding areas had similar pipe infrastructure and soils characteristics, yet were not 


experiencing similar break activity. 
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Figure 16. Calder Sub-division Study Area 
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Because information was either lacking or too general for application within the scope of 
this study, simplifying assumptions had to be made. For example, it was assumed that 
the subdivision was a uniform soil type. Soil maps of the Edmonton area (Kathol and 
McPherson, 1975) indicated that the Calder area was composed entirely of a Malmo Silty 
Clay Loam. These maps also indicated that the majority of the Edmonton area is 


composed of a silty clay loam. 
Output Analysis and Format 


Due to the unavailability of some potential raw input data (instantaneous operating 
pressures, many soils parameters), it was necessary to model only those pipe failures for 
which causal and influential factors could be easily obtained. The abundance of 
meteorological data, water temperature, and average operating pressures allowed for 
effective modeling of frost heave, soil-pipeline interaction, and pipe wall differential 
stress failure mechanisms. Although many failures were not directly attributed to
corrosion, corrosion was intuitively a contributing cause of the above failures, so these
failures were included in the modeling. Failures resulting from indeterminate causes (e.g., clamp
failures) were excluded from modeling. Failures caused by unpredictable pressure
surges (such as water hammer events) could not be excluded, since operational log data
did not indicate when such events actually occurred. However, these events were
typically infrequent.
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A water main pipe failure is defined as an event where the leaking water pipe requires 
repair. Analysis of the raw break data (water main repair forms) was performed in order 
to determine if individual failure modes were seasonal. Since break information was 
available from 1972 to 1994, analysis was performed on all break data. Results 
illustrated that transverse failures (Figure 17) and diagonal failures (Figure 18) appeared 
to be seasonal, as the highest frequency of breaks occurred in the colder seasonal months. 
Longitudinal failures (Figure 19), blowout failures (Figure 20), and clamp failures
(Figure 21) did not appear to have a seasonal pattern.
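
The monthly tallies behind this kind of seasonality check can be sketched in a few lines of Python. This is only an illustration: the record layout (a date paired with a failure-mode label), the function name and the sample records are assumptions, not data or code from the study.

```python
from collections import Counter
from datetime import date


def monthly_break_counts(records, failure_mode):
    """Count breaks of one failure mode in each calendar month (January to December)."""
    counts = Counter(day.month for day, mode in records if mode == failure_mode)
    return [counts.get(month, 0) for month in range(1, 13)]


# Hypothetical repair-report records (date, failure mode); not data from the study.
records = [
    (date(1985, 1, 14), "transverse"),
    (date(1985, 1, 30), "transverse"),
    (date(1986, 12, 3), "transverse"),
    (date(1986, 7, 9), "longitudinal"),
    (date(1987, 2, 21), "diagonal"),
]

transverse_by_month = monthly_break_counts(records, "transverse")
longitudinal_by_month = monthly_break_counts(records, "longitudinal")
```

Plotting each twelve-value list as a bar chart would give a figure of the same general form as Figures 17 to 21, with seasonal modes showing their peaks in the cold-weather months.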
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Figure 17. Monthly transverse failures, 1972-1994. 
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Figure 18. Monthly diagonal failures, 1972-1994. 
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Figure 19. Monthly longitudinal failures, 1972-1994. 
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Figure 20. Monthly blowout failures, 1972-1994, 
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Figure 21. Monthly clamp failures, 1972-1994. 


The number of 150 mm diameter pipe breaks was chosen as the output parameter
of primary focus since this pipe size accounts for 314 of the 564 pipe breaks (55.7
percent) on record for the period of 1985 to mid-1991. Table 3 lists the breakdown of
pipe breaks according to pipe diameter. Limiting the scope of the
study to a single pipe diameter also eliminated possible confusion in pipe failure
mechanisms, since the literature describes different failure modes for different pipe
sizes.


Table 3. Pipe breaks sorted by diameter, 1985-1991. 


Number of breaks Percentage of total 




It must be kept in mind that the model being developed is intended to predict the
probability of pipe failures for a general area, and is not meant to predict the probability
for individual pipes. As stated in Section 1.2, the purpose of this model study is to
demonstrate the possible utility of ANN modeling for predicting pipe breaks. As
such, further research and model development will likely be necessary.
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Collection and Analysis of Raw Input Data 


Once the availability of the data had been determined, the next steps were to assess the
reliability of the data and to transform the data into input forms
that reflected the physical causes and influences on the different failure modes.
Investigation revealed that the best potential for input parameters lay in air
and water temperature data, average operating pressures, precipitation data and temporal
clustering indices. The analysis of the raw data, the data manipulations and their justifications are
given below.


Air Temperature 


Air temperature was used as a substitute parameter for ground temperature because of
the virtual absence of ground temperature monitoring. Ground temperature is an
excellent indicator of the physical processes affecting pipe breaks. Aside from the
aforementioned hoop stress conditions it creates, it is also an important factor for
modeling frost heave (Anderson et al., 1984) and soil-pipe interactions (Yen et al., 1981).
Thus, characterizing ground temperature was seen as being of paramount importance in
characterizing the different failure mechanisms. The most appropriate surrogate
parameter was air temperature, given that it can be used to characterize ground
temperatures in time series (to account for time lags in temperature reaching the lower
ground depths).
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Air temperature was thoroughly analyzed through graphical methods in order to 
determine a time-series correlation. Either 1-year or 2-year analysis periods were 
selected to maintain reasonable accuracy of the analysis. A graphical analysis of average 
daily temperatures (Figure 22) was performed. As expected, this analysis demonstrated 
that temperature had seasonal variations. First and second differences of
the mean daily temperature were then taken (Figure 23 and Figure 24, respectively) to determine
whether a time lag relationship was apparent. This was evident, and it was decided that the
model input parameter for air temperature must include a time lag component.
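
The differencing step can be illustrated with a short Python sketch. The helper name and the sample temperatures are hypothetical; the sketch only shows how first and second differences of a daily series are formed, not how the figures in this study were produced.

```python
def difference(series, order=1):
    """Return the order-th difference of a sequence of daily values."""
    values = list(series)
    for _ in range(order):
        # Each pass replaces the series with its day-to-day changes.
        values = [later - earlier for earlier, later in zip(values, values[1:])]
    return values


# Hypothetical mean daily air temperatures (deg C); not data from the study.
daily_mean_temp = [-2.0, -5.0, -9.0, -8.0, -12.0, -6.0]

first_diff = difference(daily_mean_temp, order=1)
second_diff = difference(daily_mean_temp, order=2)
```

Each differencing pass shortens the series by one value, so the second difference of six daily readings contains four values.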


An analysis of the 7-day temperature change was also performed, and again, this analysis
indicated that a time lag component was necessary (Figure 25). This judgement was
reasonable, since ground temperature transition models (e.g., the whiplash curve
for ground-to-air heat transfer in Figure 4) support this conclusion. Temperature transition models
such as this one propose a slow thermodynamic transition of ground temperature from
the surface to increasing depth. Therefore, using a 7-day temperature lag was thought to be
reasonable.


To further determine which representations of the raw data were most indicative of the
physical process, the candidate forms were used as inputs to the Artificial Neural Network model.
The results of the ANN model indicate which form of each parameter is most effective for
model input.
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Figure 22. Average daily temperatures, 1985. 
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Figure 23. First difference of average air temperatures, 1985-1986. 
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Figure 24. Second difference of average air temperatures, 1985-1986. 
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Figure 25. Seven-day air temperature difference, 1985. 
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Several alternatives were examined to determine which would best represent the different 
failure mechanisms acting on the pipelines. To imitate the major cold weather failure 
mechanisms (frost heave, soil-pipeline interaction, and water-ground temperature 
differential), multiple input parameters using air temperature were proposed. Given the
importance of cold weather, air temperature effects had to be accounted for
in the model. As will be discussed in the model results and sensitivity analysis, the
inclusion of multiple (modified) air temperature inputs was justified in
representing the different ways in which air temperature affects pipe failure modes.


Modeling frost heave mechanics involved presenting the Artificial Neural Network 
model with input patterns designed to mimic frost heave rate. The resulting formula, 
presented in Equation 9 was used to calculate the frost heave rate characterization 
parameter. A daily time frame for measuring temperature changes was chosen to show 
how daily changes in air temperature translated to a frost penetration rate in the soil. 


This was calculated as follows: 


Frost Heave = \max_{i = 1, \dots, 7} (t_{i+1} - t_i)    [9]


Where:


t_i = air temperature on day i (°C)


Large consecutive negative differential temperature values indicated a faster rate of frost
penetration. Fluctuating temperature differential values (alternating negative and positive
values, or small negative values) indicated little or no change in the frost heave rate.
Large, positive consecutive values may indicate a warming trend, possibly having a
ground thawing effect if the number of consecutive periods is extended. This may result in
differential ground thawing along the length of the pipe, resulting in more failures. A
time lag was also input to give the ANN model memory of the
previously described extended periods.
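
As a rough illustration of the frost heave input, the following Python sketch takes the largest day-to-day change in mean daily air temperature within a window. The function name, the eight-value window and the sample temperatures are assumptions made for illustration only; Equation 9 may define the index range differently.

```python
def frost_heave_parameter(daily_temps):
    """Largest day-to-day temperature change (deg C) within the window.

    daily_temps holds consecutive mean daily air temperatures; a window of
    eight values yields seven daily differences (an assumed window length).
    """
    if len(daily_temps) < 2:
        raise ValueError("need at least two daily temperatures")
    diffs = [later - earlier for earlier, later in zip(daily_temps, daily_temps[1:])]
    return max(diffs)


# Hypothetical windows of mean daily air temperature (deg C).
steady_cooling = [0.0, -2.0, -5.0, -7.0, -10.0, -12.0, -15.0, -18.0]
fluctuating = [-5.0, -3.0, -6.0, -4.0, -7.0, -5.0, -6.0, -4.0]

cooling_value = frost_heave_parameter(steady_cooling)
fluctuating_value = frost_heave_parameter(fluctuating)
```

Under this reading, a negative value means that every daily difference in the window was negative (sustained cooling), while a positive value means at least one day in the window warmed.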


Modeling of soil-pipeline interaction involved showing how sudden, significant changes 
in temperature, either during winter climate or transitions between warm and cold 
weather periods, resulted in pipe failures. To allow the ANN model to distinguish cold 
weather events and large temperature drops, two distinct measures were devised. The 
first parameter was a measure of the climate, warm or cold. The model was presented 
with a 7-day moving average of mean daily temperatures. The calculation for this was 


based on an arithmetic mean formula presented in Equation 10: 


Seven Day Average Temperature = \frac{1}{7} \sum_{i=1}^{7} t_i    [10]


The second measure was a seven day change in temperature. This parameter was the 
maximum seven day temperature difference for the week previous to the data reference 


point, as illustrated in Equation 11. 


Maximum Seven Day Change = \max_{i = 1, \dots, 7} (t_{i+7} - t_i)    [11]
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The purpose of this parameter was to indicate a contraction of the pipe, causing an 
increase in tensile stress along the length of the pipe. As described in the soil-pipeline 
interaction section (as a failure mechanism), this type of action may result in pipe 


failures. 
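
The two soil-pipeline interaction inputs can be sketched in Python as follows. This is one possible reading of Equations 10 and 11, not the study's implementation: the function names, the zero-based day index `ref`, and the choice to compare each day with the day seven days earlier are assumptions, and the sample temperatures are hypothetical.

```python
def seven_day_average(daily_temps, ref):
    """Arithmetic mean of the reference day and the six days before it."""
    if ref < 6:
        raise ValueError("need six days of history before the reference day")
    window = daily_temps[ref - 6 : ref + 1]
    return sum(window) / len(window)


def max_seven_day_change(daily_temps, ref):
    """Largest seven-day temperature difference among the seven days ending at ref."""
    if ref < 13:
        raise ValueError("need thirteen days of history before the reference day")
    return max(daily_temps[day] - daily_temps[day - 7] for day in range(ref - 6, ref + 1))


# Hypothetical record: a steady week at 0 deg C followed by a cooling week.
temps = [0.0] * 7 + [-1.0, -2.0, -3.0, -4.0, -5.0, -6.0, -14.0]

climate_indicator = seven_day_average(temps, ref=13)
largest_change = max_seven_day_change(temps, ref=13)
```

With this sign convention a temperature drop appears as a negative difference, so a week of uninterrupted cooling gives a negative maximum.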


Internal Pipe Water Temperature 


The possible sources of pipe water temperature data were investigated. The first source
investigated was the Rosslyn reservoir water temperature record, as this was the nearest
water reservoir to the Calder area and was therefore the most reflective of the water
temperature within the water pipes. This data was analyzed for its reliability. Graphical
analysis, as illustrated in Figure 26, revealed inconsistencies in trends, which would
present “noise” for the ANN model, thus reducing its accuracy.
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Figure 26. Graphical analysis of Rosslyn reservoir average water temperature, 
1985-1991. 


Referring to Figure 26, the data appear to have a well-defined trend from 1985 until
1988, after which the trend changed and appeared more erratic. This was discussed with
Mr. Ken Richardson, supervisor of Water Transmission at Aqualta, and it was decided
that this data was not reliable. The reason was that the reservoir sensors did not detect
water temperature during station shutdowns (lasting for weeks), but instead measured
the room temperature; the sensors did not measure water temperature unless the
water was running. As a result of these inaccuracies, a surrogate measure using the water
treatment plant intake temperatures was investigated. It was concluded that the water
temperatures at the treatment plant from which the reservoir water originated could be
used as surrogate data. This data was felt to be reliable since detention time in the
reservoir was not long enough to make the water treatment plant’s temperatures
unrepresentative of the reservoir water temperature. The seven-day
average water intake temperature is shown in Figure 27.
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Figure 27. Seven day average water temperature (Rossdale Water Treatment 


Plant) 1985-1991. 


Because the thermal conductivities of soil and water are different, meteorological 
fluctuations result in different rates of transition for water and ground temperatures. 
Drops in ambient air temperature following periods of relatively constant temperature
affect the thermal exchange rates from air to water and to soil in differing degrees. This
results in a substantial temperature difference between the ground surrounding the pipe, 
and the water within the pipe. This difference creates a temperature gradient across the 
thickness of the pipe wall, resulting in differential strains and longitudinal stresses. This 


hoop stress condition increases the likelihood of longitudinal failures (Habibian, 1994). 
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Thus internal pipe water temperature, coupled with an indicator factor of ground 


temperature would help physically represent the occurrence of hoop stress conditions. 


In order to model pipe wall temperature differential stress, an expression for the magnitude
of the temperature differential was required. This was calculated by the following formula
(Equation 12):

Seven Day Average (Air - Water) Temperature Differential = \frac{1}{n} \sum_{i=1}^{n} t_i - \frac{1}{n} \sum_{i=1}^{n} w_i    [12]

Where:

t_i = air temperature on day i (°C)
w_i = water temperature on day i (°C)
n = 7 days

This input parameter would indicate instances where hoop stresses were
likely to occur.


It was thought that expressing a pipe wall temperature gradient exclusively might be
incomplete because it did not reflect other stress conditions occurring concurrently. The
presence of contributing stress conditions may be necessary for this type of failure.
Because of this possibility, a magnitude indicator in the form of the 7-day average water
temperature was added. It was calculated in the same manner as the 7-day average air
temperature. This expressed the overall temperature conditions. High pipe wall
temperature differential events occurring at sub-zero temperatures would be much more
damaging than those occurring at warmer temperatures, speculatively due to
decreased pipe ductility with decreased temperature.
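
The pipe wall differential input and its accompanying magnitude indicator can be illustrated with a short Python sketch. It assumes that the differential is the difference between the seven-day mean air temperature and the seven-day mean water temperature, which is one reading of Equation 12; the function names and sample values are hypothetical.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def air_water_differential(air_temps, water_temps):
    """Mean air temperature minus mean water temperature over the same days."""
    if not air_temps or len(air_temps) != len(water_temps):
        raise ValueError("air and water series must be non-empty and equal in length")
    return mean(air_temps) - mean(water_temps)


# Hypothetical seven-day windows (deg C); not data from the study.
air = [-10.0, -12.0, -15.0, -14.0, -11.0, -9.0, -13.0]
water = [2.0, 2.0, 1.0, 1.0, 1.0, 2.0, 5.0]

differential = air_water_differential(air, water)
water_average = mean(water)  # the magnitude indicator for the same week
```

Presenting both values to the model lets it distinguish a large differential during very cold conditions from the same differential in milder conditions.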
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Operating Pressures 


The main concern with operating pressure as a parameter is the effect of unexpected
events such as pump shutdowns and other types of pressure-related problems. Water
hammer effects resulting from pump shutdowns and accidental valve closures can result
in blowout failures, especially in pipe walls that have been weakened by corrosion
or in previously stressed pipe. Failures may also occur due to operating pressures
that are not within the design operating range for the pipe.


Investigations into the availability of operating pressure data showed that only average
daily pressures were recorded. This data was therefore not suitable for predicting
failures resulting from water hammer effects, since these are instantaneous events for which
accurate records are seldom kept. However, the benefit of using this parameter was not
completely discounted. The average seven-day operating pressure was included as a
potential input parameter to determine whether long-term operating pressures have an effect on
the failures of the mains.


Pressure information was gathered from reservoir data logs. The water pressure input 
parameter was simplified by assuming that the pressure from the reservoir servicing the 
subdivision would be an adequate indicator of the area’s overall average pressure. 
Further manipulation of data would have required specific knowledge of the elevations 
of various points within the service area, and calculating a hydraulic grade line. This 


manipulation was thought to be unnecessary since the reservoir area was static, and 
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therefore the same reference point was always used. In addition, due to the generality of 
the modeling (probability of pipe failures is only being predicted for the Calder 
subdivision as a whole unit, not for individual pipe breaks), this level of information was 


considered unnecessary. 


Rain Precipitation 


Rain precipitation is an important parameter when considered in relation to soil type.
The Calder area is composed entirely of a silty clay loam. Clayey soils will be affected
to a larger extent than sandy soils when periods of low precipitation are followed by
periods of high precipitation. The inclusion of this parameter may be especially useful
for predicting failures during the warmer seasonal months, in contrast to the
cold-weather input parameters. Graphical analysis of the pipe breaks supported this
possibility, since not all of the water main failures occurred during the colder months, as
shown in Figure 28.
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Figure 28. Total monthly breaks, 1985-1991. 


Several averaging periods were tested in initial models to determine which
would best model the physical process. Results favored the inclusion of a 180-day
moving average. The precipitation data was obtained in conjunction with the air temperature data.
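
A trailing moving average of daily precipitation can be sketched in Python as below. The function name, the use of `None` for days without a full window, and the sample values are assumptions for illustration; the example uses a three-day window so that the arithmetic can be checked by hand, whereas the study favored 180 days.

```python
from collections import deque


def trailing_moving_average(daily_values, window=180):
    """Mean of the latest `window` values for each day; None until the window fills."""
    recent = deque(maxlen=window)
    running_total = 0.0
    averages = []
    for value in daily_values:
        if len(recent) == window:
            # The oldest value is about to be pushed out of the window.
            running_total -= recent[0]
        recent.append(value)
        running_total += value
        averages.append(running_total / window if len(recent) == window else None)
    return averages


# Hypothetical daily rainfall totals (mm); not data from the study.
rain_mm = [0.0, 3.0, 6.0, 0.0, 0.0]
smoothed = trailing_moving_average(rain_mm, window=3)
```

A long window such as 180 days smooths out individual storms, so the input reflects the prevailing wet or dry trend over roughly half a year.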


Cluster Indices 


Historical studies of water main failures in other cities show that pipe failures often
occur in clusters, as demonstrated by studies conducted in Winnipeg, Manitoba (Goulter
and Kazemi, 1988; Goulter and Kazemi, 1989).


This phenomenon may be caused by a disturbance of the soil surrounding a pipe break, due
either to the pipe failure itself (e.g., water seepage causing instability of the soil) or to the repair of
the pipe failure (excavation and replacement of soil disturbing soil
strengths). Regardless, the studies demonstrated that this type of phenomenon was
significant. Due to the relative proximity of the failures (both in time and space), they
might be mistaken for a single multiple-break event, when in actuality there would be multiple
causes (i.e., the primary failure cause, and then the resulting additional failures from water
seepage or other causes).


Due to the large area selected for modeling, it was impossible to input a spatial cluster 
input parameter that would correspond to the initial break. This was due to spatial 
variability of pipe breaks over the entire area, and in this case the model output was a 
summed output. Therefore the model was not designed to determine the exact location of 
a pipe break, and a spatial cluster parameter was not a truly indicative input parameter. 
For example, trying to spatially correlate two separate pipe failures, several kilometres 
apart, (under the assumption that one failure is linked to the other) would be incorrect, 


causing noise and possibly decreasing model accuracy. 


Although a spatial cluster parameter was not thought to be useful, a temporal index
parameter was created using a one-week lag of breaks within the study area. In this
manner, it was thought that pipe breaks occurring in the time period immediately
preceding the current period would adequately represent this phenomenon.
Therefore, the previous week’s number of breaks was input to show temporal
correlation. It was decided that, given the time frame selected (one-week intervals),
periods longer than two weeks (the present week interval and the week before) would not
be physically meaningful for temporal correlation.
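
The temporal cluster index amounts to a one-step lag of the weekly break counts. A minimal Python sketch follows; the function name, the `None` placeholder for the first week and the sample counts are assumptions for illustration.

```python
def lagged(series, lag=1, fill=None):
    """Shift a weekly series forward by `lag` steps, padding the start with `fill`."""
    values = list(series)
    padding = [fill] * min(lag, len(values))
    return padding + values[: max(len(values) - lag, 0)]


# Hypothetical weekly break counts for the study area.
weekly_breaks = [0, 3, 1, 0, 2]
previous_week_breaks = lagged(weekly_breaks, lag=1)

# Each input pattern pairs the previous week's count with the current week's count.
patterns = list(zip(previous_week_breaks, weekly_breaks))
```

The first week has no predecessor, so its pattern would normally be dropped before training.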


Pipe Integrity 


The pipe integrity parameter was seen as an important indicator of the overall structural
integrity of the water mains within the subdivision. This measure was calculated by
summing the total number of pipe breaks that occurred in the one-year span previous to the
input data point in time. It was thought that this input parameter would indirectly indicate
whether these pipes were affected by corrosion and would give an approximate measure
of the potential degree of this corrosion. This was important since soil parameters
indicative of corrosion were not available in sufficient detail or quantity to be useful for
model development. The number of breaks of the same pipe diameter within the
previous year was accepted as a correlated indicator of pipe integrity, as greater pipe
integrity would be indicated by fewer breaks within the previous year. Conversely, more
breaks would indicate a weaker overall state of pipes within the study area, and
thus a higher pipe failure frequency.


This parameter was a moving total, and each input pattern included only the total
number of breaks within the previous year. Analysis of the trends showed that the total
number of breaks within the previous year fluctuated (shown in Figure 29), which
seemed peculiar. It would be expected that the total number of breaks within the
previous year would continually increase, to reflect the continually worsening degree of
corrosion. However, several factors may explain why this was not seen. Variable weather
patterns (e.g., severe or mild winters) may affect yearly break rates. Effective mitigation
techniques (e.g., cathodic protection, extensive pipe repairs, and pipe replacements) may
be another source of variation. Also, random or exceptional events severe
enough to cause additional failures may make a year uncharacteristic.
Therefore, historical information is intrinsically embedded in this input parameter.
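
The moving total behind the pipe integrity parameter can be sketched in Python as follows. The 365-day window, the exclusion of the reference date itself, the function name and the sample dates are all assumptions made for illustration.

```python
from datetime import date, timedelta


def breaks_in_previous_year(break_dates, reference_date, days=365):
    """Count breaks in the `days` days before the reference date (date itself excluded)."""
    start = reference_date - timedelta(days=days)
    return sum(1 for day in break_dates if start <= day < reference_date)


# Hypothetical break dates for one pipe diameter; not data from the study.
break_dates = [
    date(1985, 1, 10),
    date(1985, 6, 1),
    date(1985, 12, 20),
    date(1986, 2, 1),
]

integrity_mid_january = breaks_in_previous_year(break_dates, date(1986, 1, 15))
integrity_march = breaks_in_previous_year(break_dates, date(1986, 3, 1))
```

Because old breaks leave the window as new ones enter, the total can fall as well as rise, which is consistent with the fluctuation seen in Figure 29.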
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Figure 29. Previous year total pipe breaks (moving total), 1985-1991. 


Omission of Potential Input Parameters 


From the above description, it becomes obvious that a number of causal or influencing 
factors (from literature, summarized in Table 2) have been transformed and/or replaced 
with surrogate input parameters. Others are omitted altogether. Those parameters 


selected as potential model input parameters are summarized in Table 4. Lagged values 



for each component (i.e., the parameter’s value of the week directly previous) are 


indicated if they were considered potential input parameters. 


Table 4. Summary of Potential Input Parameters. 


Factors identified in the literature as relating to pipe failure mechanisms or as indicative of the
durability of the individual pipes were not included. This was because of the manner in
which the study area was modeled. The individual pipe’s overall physical characteristics,
such as pipe diameter and type of pipe joint connections, were not included because of
the difficulty in transforming these individual pipe characteristics into an overall area
characteristic (certain areas were installed years later, others had been replaced). This
was the primary reason for limiting the scope of the study to one size of pipe. It was
also seen as advantageous, as it is easier and also more accurate to model a single output.


Environmental parameters affecting the pipe, such as overburden pressure, soil type and
properties, soil pH, and soil water content, were omitted because this data was not monitored.
The literature review showed these parameters to be appropriate measures of
corrosion and soil-pipeline interaction strengths. However, because the purpose of the
model was to predict the likelihood of failures for an overall study area, and for
cast iron water mains only, the selected input parameters were presumed adequate.


Models were developed utilizing the above input parameters. 


3.1.2 Open-Domain Problems 


Modeling of the Calder area pipe failures was difficult because the area is an open-domain
system. An open-domain system is one that is not controlled, and in which interferences
such as alternate causes can neither be identified nor quantified for their overall effect
within the system.


Because the study dealt with an open-domain system, Artificial Neural Networks were
considered the most suitable modeling methodology. Nonetheless, modeling difficulties
were anticipated even with an ANN-type model. Exposure to a full range of data patterns in time
series was required for the ANN model to make appropriate input-to-output parameter
associations and, therefore, accurate models. By comparison, a lack of appropriate data,
including boundary conditions, is extremely unfavorable for developing accurate
deterministic-type models. A deterministic model typically requires a controlled,
closed-domain system. This type of model derives values based on magnitudes of measures,
and often not in time series. Therefore, ANN methodology has an advantage in
open-domain studies since it does not necessarily require boundary conditions.


3.1.2.1 Time-Frame Selection 


To contend with the open-domain nature of pipe breaks, a suitable time frame was
investigated. To reasonably model the pipe break process, it was essential to ascertain
the time frame over which the input parameters could correlate to the output parameter
(pipe breaks). Initial trials with daily data were attempted, but it was found that some
input parameters, primarily those related to temperature, could not adequately form an
association with pipe breaks. This could be attributed to the time required for cold air
temperature effects to propagate into the ground to pipe depth or into the water. A daily interval
also did not allow the ANN model to correlate an air temperature drop with a pipe break.
Model results using an arbitrarily set seven-day interval were reasonable, and greater
accuracy in prediction was immediately apparent. Because the
seven-day interval equates to a week (as opposed to a four-, five- or
six-day interval), interpretation of the data in a physical time sense was also more easily
comprehended.


Weekly time intervals were implemented in the model by taking seven-day averages,
seven-day totals, or maximum daily values within the one-week interval, depending on
each parameter’s function in the model. Where average values were required, daily input
data points were calculated using the reference date and the six days previous to this date
for averaging purposes. For instance, averaged values from January 1st to January 7th
would be included in the January 7th reference data point. Where seven-day totals were
required, the same principle was used. For the same example, a total of the daily values
from January 1st to January 7th (inclusive) would be included in the January 7th
reference data point. For maximum values, the maximum daily value within the seven days
would be selected for modeling.
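
The windowing rule described above (the reference date plus the six days before it) can be sketched in Python. The dictionary keyed by date, the function names and the sample values are assumptions made for illustration, not the spreadsheet procedure actually used.

```python
from datetime import date, timedelta


def weekly_window(daily_by_date, reference_date):
    """Daily values for the reference date and the six days before it, oldest first."""
    days = [reference_date - timedelta(days=offset) for offset in range(6, -1, -1)]
    return [daily_by_date[day] for day in days]


# The three weekly aggregations named in the text.
AGGREGATORS = {
    "average": lambda window: sum(window) / len(window),
    "total": sum,
    "maximum": max,
}


def weekly_value(daily_by_date, reference_date, kind):
    """Aggregate the seven-day window ending on the reference date."""
    return AGGREGATORS[kind](weekly_window(daily_by_date, reference_date))


# Hypothetical daily values for January 1st to January 7th, 1985.
daily = {date(1985, 1, day): float(day) for day in range(1, 8)}
reference = date(1985, 1, 7)

weekly_average = weekly_value(daily, reference, "average")
weekly_total = weekly_value(daily, reference, "total")
weekly_maximum = weekly_value(daily, reference, "maximum")
```

Looking values up by date rather than by list position means a missing day raises a `KeyError` instead of silently shifting the window.
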
3.1.3 Limitations 


As discussed previously, the feasibility of developing a successful model hinges on the
quality of the data available. It was demonstrated in the literature review and source data
analysis that the largest limiting factor for modeling ease and accuracy is the availability of
appropriate and comprehensive data.


There was a discrepancy between ideal input parameters and the actual data available. 
This discrepancy serves to emphasize the point that in order to facilitate ease and 
accuracy in development of an Artificial Neural Network model, limitations to output 


and input parameters must be overcome. 
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3.1.3.1 Limitations of Output Parameter 


A model output parameter employing an area break density (number of
pipe breaks/km of pipe/km² of ground area) was investigated as an alternative to a linear break
density. This type of output was thought to be more valuable than a linear break density
since it would have the ability to define and pinpoint a smaller area for use in
Edmonton’s water main replacement program. However, information of this type was
not readily available, and obtaining the parameter in this form required meticulous
manipulation of spreadsheet data and city maps. Much of the historical data could not be
found in spreadsheet form, although work to update
city databases is ongoing. Therefore, the linear break density was adopted.


3.1.3.2 Limitations of Input Parameters 


The availability and reliability of the raw source data created the need for extensive
analysis and manipulation of the raw data in order to provide input data patterns
representative of the failure mechanisms. The amount of analysis (and re-analysis)
required hindered the timely development of the models.


Because of the open-domain nature of the system, and its effect on the correlation 
between input and output patterns, unpredictable results could be expected. Dealing with 
these problems involved making assumptions. Fair assumptions could be made, 
supported by theory (from literature) or by circumstances (small areas may be assumed to 
have similar characteristics). However, this will tend to limit the range of application 
unless the availability and reliability problems are mitigated. 


Soil type, an important factor for several failure mechanisms, was not used in the model 
because available soil maps were too general, and indicated the Calder area was 
composed entirely of a silty clay loam, as was a great majority of the Edmonton area. 
Finally, due to a lack of monitoring of most of the soil parameters mentioned, it is 
impossible to include any such information without taking samples from individual areas. 
This process would be time consuming and not economically feasible. For this reason, 


the developed model is limited to areas where uniform soil conditions can be assumed. 


3.2 Modeling Methodology 


The four-step model development methodology for Artificial Neural Networks was 
generally followed. As mentioned earlier, these four steps are: Source Data Analysis; 
System Priming; System Fine-Tuning; and Model Evaluation. The Source Data Analysis 
stage has already been defined and performed in the above section. System Priming has 
already been partially completed, and was done concurrently with System Fine-Tuning. 
These middle stages involved employing systematic manipulations of data and ANN 
model structures to arrive at the best models. Model Evaluation is the stage where 
performance criteria are defined and the best models chosen. 

The first objective of source data analysis was to understand the problem 
being modeled and to establish the cause-effect and influencing factors as they pertained 
to the output. Having done this through careful research and review of available 
literature, it would be possible to obtain the necessary data. The second purpose of the 
source data analysis stage was to determine the reliability of the data, and to prepare the 
data for input into the ANN model. Given that the study domain modeled was an open 
system, uncertainty of the inputs was unavoidable. Inaccuracies in measurement of input 
parameters (instrumentation accuracy and tolerances may not always be good; many 
parameters are not measured with satisfactory frequency) resulted in “noise” during the 
training of the model, possibly affecting the model’s forecasting capabilities. For this 
reason, it was extremely important to have input parameters that would accurately reflect 
the physical pipe failure mechanisms. 


As illustrated in the above section, there were a number of instances where the 
availability or reliability of the raw input data collected imposed limitations. Therefore 
inputting the most representative patterns and use of suitable ANN model structures 
would be paramount for accurate model development. This was achieved in the system 


priming and system fine-tuning steps. 


In the System Priming stage, the overall objective was to determine which of the 
potential model input parameters would produce the best predictive model. This 
involved inputting patterns which best described the cause-effect and influencing factors. 


Having done this, the next stage involved methodically determining which of these 
parameters should be included in the model. Again, for open-domain system studies, this 
process becomes more difficult because of the need for confidence in the relevance and 
reliability of the data. 


Due to the use of a number of transformed and surrogate input parameters, it was decided 
that the system priming and system fine-tuning steps would be developed 
interchangeably with one another. This reasoning became evident during the model 
training stage. Early results did not indicate satisfactory prediction capabilities, despite 
use of sound methodologies. Therefore an iterative process was employed between the 
system priming and system fine-tuning stages. The methodologies applied will be further 
described in the following section. Further model refinement and input parameter 
inclusions or exclusions would be decided upon depending on the iterative modeling 


results. 


The System Fine-Tuning stage consisted of meeting three goals. The first was to 
determine the most appropriate way to present the data sets (training, test, and production 
sets) to the ANN model in order to allow the model to “learn” the data by appropriate 
associations, without having it “memorize” the data. The second goal was to distribute 
events within the data sets such that the full range of occurrences would be captured by 
the model. These first two goals were extremely important to maintain the forecasting 
power of the model. The third goal of system fine-tuning was to develop the model type 
and model structures which would most easily and efficiently model the domain, while 


including features typical for the process. 

As previously mentioned, the model structure, the distribution of data sets and data 
points, and model input parameters were determined in conjunction with one another. 
Determining the best combination of these elements involved iterative trials; however, use 
of analytical modeling methodologies allowed for the most efficient convergence. 
Details of the different methodologies employed are discussed in this section. 
3.2.1 Model Progression 


For preliminary modeling, a wide range of model architectures was examined (standard 
connections, jump connections, recurrent networks and Ward networks). This was 
performed to determine if any particular network was more advantageous for the inputs 
chosen. The most important potential input parameters were inputted into each model 
type. The models were then evaluated by the R² statistic (a measure of the model’s 
predictive error when compared to actual output data). This measure was applied to the 
cumulative ANN model predictions. This method allowed for a fair, overall comparison 
of the prediction errors. A secondary evaluation was by visual inspection of actual 
versus predicted results, to surmise whether the break trends were being matched. 
Based on these results, it was evident that the standard connection backpropagation 
networks were the most suitable network architectures. The standard networks, using 1 
or 2 hidden layers, provided the greatest potential for model development. The standard 
connection network was also desirable for preventing complication of the model 
development process and to allow for further application of the model, since this model 
architecture type is quite simple in structure. 


Once the general model type was determined, the next task was to determine which 
model input parameters would be most relevant for model accuracy, and in what 
proportions these data sets should be presented to the model. A portion of the data was 
selected for the training and testing sets, and the remainder was used in the production 
set. The predictive properties of the model were then tested on the entire data set (the 
pattern file, consisting of the training, test and production sets). This pattern file was 
arranged in chronological order to maintain the time-series predictive properties of the 


model. 


To determine each input parameter’s significance to the models, a number of methods 
were attempted, using the R² statistics of the production and pattern files. An 
addition-type method was used first. This method involved gradually adding more input 
parameters, starting with five input parameters. A factorial design procedure including 
all parameters was also utilized in order to determine the appropriate input parameters. 
However, the significance of individual parameters was not readily evident through this 
type of analysis since parameter interactions were significant. A substitution-type 
procedure was then employed. This method involved removing a single parameter to 
determine the parameter’s significance. This parameter was then replaced, and a 
different parameter excluded. The best results were obtained using the addition- and 
substitution-type methods. 

Different configurations of the standard backpropagation networks were tried 
concurrently with the determination of the important input parameters. Due to the 
interdependence of the input parameters and the type of network model used, a large 
number of trials were attempted. In each case, the model structure was kept constant 
while the determination of the important input parameters proceeded. Once progress was 
made in determining relative importance of an input parameter (depending on the method 
of input parameter determination being performed) a change in the network configuration 
(i.e., number of layers, number of neurons, ratio of neurons in adjacent layers, etc.) was 
performed. This process was iterated, until a large number of models with calculated 
cumulative R² statistics (on the models’ production sets and pattern files) were 
accumulated. From the above, ten overall potential models were selected. 


From the selected potential models, minor manipulation of the data set proportions and 
data points was performed. The purpose of doing this was to further optimize the 
potential models, and ensure the selected models were exposed to the full range of data 
patterns. These model variations were compared using the R² statistic applied to the 
models’ pattern files, and the two best models were chosen for further manipulation. As 
a final step, the activation functions were also varied, after which final model 
evaluations were made. 


Model evaluation is the selection of the developed model that demonstrates the overall 
minimal error in prediction, while correctly predicting the output trends based on the 
given information. The criterion for this selection was the best combination of R² value 
and visual inspection (to match trends), indicated by slope comparison of actual and 
predicted break values. Consideration of the models’ predictive robustness (sensitivity to 
changes in parameters, that is, evaluation of the model logic and of whether the model 
appropriately accommodates changes) was also factored in. The best models are described 
in Section 4.0. 
3.2.2 Data Set Manipulation 


When training the Artificial Neural Network model, it must be realized that accurate 
models will only be achieved if the manner in which the data is presented is appropriate. 
Therefore, it is important that a representative amount of the potential data is exposed to 
the model for training and testing of the model’s progress. Without having this 
representative range of historical events to detect and verify, the model will not be able to 


predict events too far outside of the learned cause-effect logic. 


Determination of the optimal amount of historical data required for the models’ training, 
testing and production sets was based on a trial-and-error method. For a given model 
(one with established predictive ability), the fraction of data to be inputted into each set 
was varied. Results from the R² and trend predicting ability were analyzed to determine 
the effect of varying the data set proportions. Too little information in the training and 
testing sets produced poor results in the production set. This may be likened to poor 
learning of the model. Too much information in the training and testing sets resulted in a 
comparable decrease in trend prediction and R² value when applied to the production set. 
This may be likened to the model memorizing only the information presented to it, and 
then performing poorly when attempting to apply itself to new data patterns with slight 
differences. 


Division of input data into training, testing and production sets was performed on a 
number of different selection criteria in order to determine if any of these methods was 


more time efficient in terms of training time. These included: 


1. Random selection; 
2. Frequent interval selection; 
3. Yearly interval selection; 
4. Specific data point selection (based on sorted output results); 
5. Specific data point selection (based on sorted output and grouped input division); and 
6. Event selection/elimination (based on model results). 


Random selection refers to a semi-random division performed by the NeuroShell 2 
program. This is considered semi-random since the program requires the input of a 
“seed” value in order to divide the pattern file into the three data sets (test, training and 
production). Input of an identical seed value results in identical division of the same file. 
Frequent interval selection acts very similarly, except the modeler specifies the 
frequency of selection of a pattern into the three data sets. These selection criteria are the 
most automatic, with little adjustment or judgement from the modeler. 
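
The two automatic selection criteria can be illustrated with a short sketch. It does not reproduce the NeuroShell 2 routines. The function names, the 60/20/20 split fractions and the interval values are assumptions chosen only to show how a seed makes a "random" division repeatable and how an interval rule assigns patterns.

```python
import random
from typing import Dict, List


def seeded_split(n_patterns: int, seed: int,
                 train_frac: float = 0.6, test_frac: float = 0.2) -> Dict[str, List[int]]:
    """Semi-random division: the same seed always yields the same division."""
    rng = random.Random(seed)          # private generator, fully determined by the seed
    order = list(range(n_patterns))
    rng.shuffle(order)
    n_train = round(n_patterns * train_frac)
    n_test = round(n_patterns * test_frac)
    # Indices are re-sorted so each set stays in chronological order.
    return {
        "training": sorted(order[:n_train]),
        "test": sorted(order[n_train:n_train + n_test]),
        "production": sorted(order[n_train + n_test:]),
    }


def interval_split(n_patterns: int, test_every: int = 5,
                   production_every: int = 7) -> Dict[str, List[int]]:
    """Interval division: every k-th pattern is sent to the test or production set."""
    sets: Dict[str, List[int]] = {"training": [], "test": [], "production": []}
    for i in range(n_patterns):
        if (i + 1) % production_every == 0:
            sets["production"].append(i)
        elif (i + 1) % test_every == 0:
            sets["test"].append(i)
        else:
            sets["training"].append(i)
    return sets


first = seeded_split(50, seed=42)
second = seeded_split(50, seed=42)
print(first == second)          # identical seed, identical division
print(interval_split(10))
```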

The remainder of the selection criteria required use of judgement by the researcher. The 
yearly interval selection criterion was based on the idea that the data must be presented to 
the model in time series, since the output was determined in time series. In specific data 
point selection, individual data patterns, or small groups of data patterns (up to 
five patterns), were selected for inclusion into one of the three data sets. Selection of data 
patterns involved maximizing the exposure of the output and inputs in all data ranges, 
based on mechanical sorting of the values of the output parameters alone, or mechanical 
sorting of the data patterns in the input and output parameters at once. Event 
selection/elimination employed the GRNN models to determine how an individual data 
pattern affects the prediction accuracy. This is described below. 
3.2.3 Data Point Manipulation: Selection/Elimination Protocol 


This method of data set division involved evaluating individual data patterns, or a small 
set of data patterns (five or fewer), in the pattern file. A decision was then made whether 
the individual data pattern should be included in the training, testing or production set. 
This decision-making process employed the following protocol: 

1. R² model improvement; 
2. False prediction of non-break events (phantom peaks); and 
3. Prediction of break events. 

The GRNN model was used because of the learning architecture’s ability to memorize 
input patterns when presented with sparse data sets (Ward Systems Group, 1993). 


The reasoning for this decision-making protocol was to employ a supplementary teaching 
method to improve the models’ exposure to specific events. It allowed the ANN model 
to be exposed to an event and acquire the appropriate correlation to the output based on 
recognition of the specific input pattern. In essence, the model had to be tutored, as only 
part of the logic of the problem was being originally learned. Therefore, the above 
learning method allowed the model to become accustomed to learning particular data 


patterns. 
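
The three protocol criteria can be expressed numerically, as in the sketch below. This is an illustration under stated assumptions rather than the procedure used in the study. The 0.5 threshold for counting a predicted event, the rule that a move is kept only if no criterion worsens, and the sample predictions are all hypothetical. No GRNN is trained here; the predictions are supplied directly.

```python
from typing import Dict, List


def r_squared(actual: List[float], predicted: List[float]) -> float:
    """R^2 = 1 - SSE / SST, assuming the usual coefficient of determination."""
    mean = sum(actual) / len(actual)
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sst = sum((a - mean) ** 2 for a in actual)
    return 1.0 - sse / sst


def protocol_scores(actual: List[float], predicted: List[float],
                    threshold: float = 0.5) -> Dict[str, float]:
    """Score one model run against the three protocol criteria."""
    pairs = list(zip(actual, predicted))
    return {
        "r2": r_squared(actual, predicted),
        # Phantom peaks: an event is predicted where no break occurred.
        "phantom_peaks": sum(1 for a, p in pairs if a == 0 and p >= threshold),
        # Break events for which the model also predicts an event.
        "breaks_predicted": sum(1 for a, p in pairs if a > 0 and p >= threshold),
    }


def keep_move(before: Dict[str, float], after: Dict[str, float]) -> bool:
    """Assumed decision rule: keep the data pattern move only if nothing worsens."""
    return (after["r2"] >= before["r2"]
            and after["phantom_peaks"] <= before["phantom_peaks"]
            and after["breaks_predicted"] >= before["breaks_predicted"])


actual = [0, 2, 0, 1, 0, 0]                      # hypothetical weekly break counts
pred_before = [0.6, 1.0, 0.1, 0.2, 0.7, 0.0]     # before moving patterns into training
pred_after = [0.1, 1.6, 0.0, 0.8, 0.2, 0.1]      # after moving patterns into training
before = protocol_scores(actual, pred_before)
after = protocol_scores(actual, pred_after)
print(before, after, keep_move(before, after))
```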


The following two figures demonstrate the effect of moving five data patterns from the 


pattern file into the training file. In this example, the model was initially trained using 


only break events through a GRNN architecture (see results in Figure 30). 
Figure 30. Event prediction before training with non-break data. 


Subsequently, five data patterns were selected from the pattern file, and distributed to the 
training and testing sets. The result of this manipulation on model results is illustrated in 


Figure 31. 
Figure 31. Event prediction after “point A” data point addition. 


From these results, it is demonstrated that there is a potential use of this data selection 
technique. While this technique requires judgement, and time to evaluate the results, it 
has useful applications for studies such as this, when studies are of open-domain systems 


and data is limited. 


3.2.4 Input Parameter Selection: Methods Tried, Results 


Three distinct methodologies were employed to determine the appropriateness of the 
potential input parameters for the pipe break study. Initially, a methodology was applied 
which involved gradually increasing the number of potential input parameters, and then 
determining the effect on the ANN model. A factorial design procedure was also 
attempted, to take into consideration the significance of the interaction between the input 
parameters. Thirdly, substitution of individual input parameters was attempted, to 
determine the effect of the input’s absence on the model. Evaluation of the different 
parameters was based on the R² statistic of the model. 


3.2.4.1 Inputs Addition 


Initial neural network models were configured by varying a small number of parameters 
that were thought to be the more important parameters. The input parameters which were 
initially investigated were: 7-day average temperature; 7-day temperature change; 7-day 
average water temperature; (air-water) temperature differential; and total pipe breaks for 
one year previous. It was anticipated that significant changes in modeling accuracy could 
be observed in order to predict major trends. A second goal was to determine which 
form of the input parameter (specifically air temperature) was most appropriate for 


modeling accuracy. 


This method involved building initial models with only the parameters that were thought 
to be most important, as per the cited literature. By beginning with only the bare minimum 
number of input parameters, and then gradually adding more potential input 
parameters, the models being developed would gradually increase in R² value. Input 
parameters whose addition provided marginal or no improvement were not significant to 
modeling the study and, therefore, were excluded. 
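
The addition-type procedure amounts to a forward selection loop, sketched below. The `score` callable stands in for training an ANN and reading off its R². The toy gain table, the input names and the `min_gain` cut-off of 0.01 are hypothetical values used only to make the example run.

```python
from typing import Callable, Dict, List, Sequence


def addition_selection(base: Sequence[str], candidates: Sequence[str],
                       score: Callable[[List[str]], float],
                       min_gain: float = 0.01) -> List[str]:
    """Add candidates one at a time; keep only those that raise the score enough."""
    selected = list(base)
    best = score(selected)
    for name in candidates:
        trial = score(selected + [name])
        if trial - best > min_gain:      # marginal or no improvement: exclude
            selected.append(name)
            best = trial
    return selected


# Hypothetical stand-in for "train a model and return its R^2".
TOY_GAIN: Dict[str, float] = {
    "avg_air_temp_7d": 0.30,
    "air_temp_change_7d": 0.10,
    "avg_water_temp_7d": 0.20,
    "air_water_differential": 0.10,
    "breaks_prev_year": 0.15,
    "breaks_prev_week": 0.04,
    "max_daily_temp_change": 0.003,
}


def toy_score(inputs: List[str]) -> float:
    return sum(TOY_GAIN[name] for name in inputs)


base_inputs = ["avg_air_temp_7d", "air_temp_change_7d", "avg_water_temp_7d",
               "air_water_differential", "breaks_prev_year"]
selected = addition_selection(base_inputs,
                              ["max_daily_temp_change", "breaks_prev_week"],
                              toy_score)
print(selected)
```

In practice the result depends on the order in which candidates are offered, which is one reason the substitution-type check is a useful complement.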
3.2.4.2 Factorial Design 


Factorial design analysis of the potential input parameters involved selecting a fixed 
network followed by varying the parameters that were likely to be redundant or to have 
significant interaction effects. Evaluation of this method involved factorial analysis of 
the production and pattern file R² values. The R² of the production file was used to 
determine the model’s ability to learn the patterns, while the R² for the pattern file 
was used in order to maintain time-series prediction. 


Care was taken so that a small enough range for each input parameter was chosen. 
Because of the inherent non-linearity of the network configuration, the effects of varying 
a single factor were also non-linear. Therefore, if due care was not taken, it was possible 
that a change in the effect would not be noticed. Also, because of the non-linear 
behavior, either the number of input parameters or the type of input parameter could 
vary depending on the type of model chosen (e.g. Recurrent Networks compared to 
Standard Networks). Due to this non-linear variability, an iteration process (checking 
fit of various model types against refined input parameters) was required, creating the 
need to perform several hundred model runs. 
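
A factorial analysis of this kind can be sketched for the simplest case, a two-level (excluded/included) full factorial design. The thesis does not state the levels used, so the two-level layout, the factor names and the toy R² response below are assumptions for illustration only.

```python
from itertools import product
from typing import Dict, List, Sequence

Run = Dict[str, int]


def full_factorial(factors: Sequence[str]) -> List[Run]:
    """All include (1) / exclude (0) combinations of the given factors."""
    return [dict(zip(factors, levels)) for levels in product([0, 1], repeat=len(factors))]


def main_effect(runs: List[Run], responses: List[float], factor: str) -> float:
    """Mean response with the factor included minus mean response with it excluded."""
    on = [r for run, r in zip(runs, responses) if run[factor] == 1]
    off = [r for run, r in zip(runs, responses) if run[factor] == 0]
    return sum(on) / len(on) - sum(off) / len(off)


def interaction_effect(runs: List[Run], responses: List[float], f1: str, f2: str) -> float:
    """Mean response where the two levels agree minus mean where they differ."""
    same = [r for run, r in zip(runs, responses) if run[f1] == run[f2]]
    diff = [r for run, r in zip(runs, responses) if run[f1] != run[f2]]
    return sum(same) / len(same) - sum(diff) / len(diff)


def toy_r2(run: Run) -> float:
    """Hypothetical R^2 response with a negative (redundancy) interaction."""
    a, b = run["temp_change"], run["temp_differential"]
    return 0.5 + 0.2 * a + 0.1 * b - 0.15 * a * b


runs = full_factorial(["temp_change", "temp_differential"])
responses = [toy_r2(run) for run in runs]
print(main_effect(runs, responses, "temp_change"))
print(main_effect(runs, responses, "temp_differential"))
print(interaction_effect(runs, responses, "temp_change", "temp_differential"))
```

When the interaction term is comparable in size to the main effects, as in this toy response, the main effects alone say little about an individual parameter.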


3.2.4.3 Substitution-Elimination 


This process involved removing and replacing a single potential input parameter, and 
then evaluating its effect on the R² statistical value. A lower value of R² indicated 
significance of the parameter. A higher R² statistic indicated that the parameter was not 
significant or was contributing noise. This process may be considered a trial-and-error 
type of experimentation. 
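
The substitution-elimination check can be written as a leave-one-out loop, as in the following sketch. As before, the `score` callable is a placeholder for training a model and computing its R². The toy contribution table and input names are hypothetical.

```python
from typing import Callable, Dict, List, Sequence


def substitution_screen(inputs: Sequence[str],
                        score: Callable[[List[str]], float]) -> Dict[str, float]:
    """Remove one input at a time and report the change in score.

    A negative change (score drops without the input) suggests the input is
    significant; a positive change suggests it is not significant or adds noise.
    """
    baseline = score(list(inputs))
    changes = {}
    for name in inputs:
        reduced = [x for x in inputs if x != name]   # the input is replaced afterwards
        changes[name] = score(reduced) - baseline
    return changes


# Hypothetical stand-in for "train a model and return its R^2".
TOY_CONTRIBUTION = {"avg_air_temp_7d": 0.40, "avg_water_temp_7d": 0.30, "noisy_input": -0.05}


def toy_score(inputs: List[str]) -> float:
    return 0.2 + sum(TOY_CONTRIBUTION[name] for name in inputs)


changes = substitution_screen(list(TOY_CONTRIBUTION), toy_score)
for name, delta in changes.items():
    verdict = "significant" if delta < 0 else "not significant / noise"
    print(f"{name}: change in R2 = {delta:+.2f} ({verdict})")
```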
4.0 RESULTS 


4.1 Evaluation Criteria 


Evaluation of the potential models involved a combination of three factors: R² model fit 
statistics, trend prediction ability (slope matching) and simplicity (model architecture 
complexity, measured by the number of hidden layers and total number of neurons). 
The reasons for these multiple criteria were that the best model must: 

1. Predict the events with accuracy (R² value); 
2. Have a strong trend prediction ability (matching the rate at which breaks would occur 
in any particular time frame); and 
3. Be simple enough to be used for further modeling purposes. 


Based on these criteria, the best models were chosen. Evaluation was based upon a 
combination of quantitative measures and good judgement, since only the first criterion is 
completely quantitative. Trend prediction is subjective, since slope changes are so rapid 
and frequent, and visual slope matching (of actual versus predicted trends) for particular 
time periods is most important. Simplicity of the model is also subjective, and depends 


on the expertise of the modeler. 
4.1.1 R² Statistic 


Calculation of the R² value for the cumulative model was performed using the following 
formula (Equation 13): 


$$R^2 = 1 - \frac{\sum (y - \hat{y})^2}{\sum (y - \bar{y})^2} \qquad [13]$$ 


Where: 

$y$ = actual value; 
$\hat{y}$ = predicted value; and 
$\bar{y}$ = mean value of $y$. 


This was calculated using the applied pattern file, since cumulative breaks are not 
calculated in the ANN model itself. Instead, Excel was used to sum the breaks in the 
time series, and the sums were then used (according to the above equation) to calculate 
the R² value. 


The R² statistic depicted in the model results is applied to the entire data set (6½ years of 
data). This statistic indicates the overall accuracy of the model, such that the probability 
of events occurring is predicted well on a regular basis. 
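
The cumulative R² calculation, which the study carried out in a spreadsheet, can be sketched in a few lines. The sketch assumes the usual definition R² = 1 − Σ(y − ŷ)² / Σ(y − ȳ)², and the break counts and predictions are hypothetical.

```python
from itertools import accumulate
from typing import List


def r_squared(actual: List[float], predicted: List[float]) -> float:
    """R^2 = 1 - SSE / SST, with SST taken about the mean of the actual values."""
    mean = sum(actual) / len(actual)
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sst = sum((a - mean) ** 2 for a in actual)
    return 1.0 - sse / sst


def cumulative_r_squared(actual_breaks: List[float], predicted_breaks: List[float]) -> float:
    """Sum the breaks through the time series, then compute R^2 on the running totals."""
    return r_squared(list(accumulate(actual_breaks)), list(accumulate(predicted_breaks)))


actual = [0, 1, 0, 2, 0, 1]                  # hypothetical breaks per period
predicted = [0.2, 0.7, 0.3, 1.5, 0.2, 0.9]   # hypothetical continuous model outputs
event_r2 = r_squared(actual, predicted)
cumulative_r2 = cumulative_r_squared(actual, predicted)
print(round(event_r2, 3), round(cumulative_r2, 3))
```

For this sample the cumulative statistic is higher than the event-by-event statistic, because individual over- and under-predictions partly cancel in the running total.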
4.1.2 Trends Prediction 


Trends prediction ability was based on matching the slope of the actual results and model 
prediction results. Overlapping or parallel lines indicated good forecasting ability since 
break rates were matched. Gaps between the two lines are not indicative of trend 
predictive ability since any errors in prediction are accumulated throughout the length of 
time the model is being evaluated. These gaps are reflected in the R² model fit statistic. 
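
Slope matching was judged visually in this study. A numerical analogue is sketched below as an illustration only. The window width, the function names and the sample curves are assumptions. The point of the example is that a predicted curve parallel to the actual curve, but offset from it, has a slope gap of zero even though its accumulated error is not zero.

```python
from typing import List


def window_slopes(cumulative: List[float], width: int) -> List[float]:
    """Average break rate over consecutive, non-overlapping windows."""
    return [(cumulative[i + width] - cumulative[i]) / width
            for i in range(0, len(cumulative) - width, width)]


def mean_slope_gap(actual_cum: List[float], predicted_cum: List[float], width: int) -> float:
    """Mean absolute difference between actual and predicted window slopes."""
    a = window_slopes(actual_cum, width)
    p = window_slopes(predicted_cum, width)
    return sum(abs(x - y) for x, y in zip(a, p)) / len(a)


actual_cum = [0, 1, 2, 4, 6, 7, 8]             # hypothetical cumulative breaks
offset_cum = [x + 3 for x in actual_cum]       # parallel curve with a constant gap
steady_cum = [0, 1, 2, 3, 4, 5, 6]             # constant-rate prediction

print(window_slopes(actual_cum, 2))
print(mean_slope_gap(actual_cum, offset_cum, 2))
print(mean_slope_gap(actual_cum, steady_cum, 2))
```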


4.1.3 Model Simplicity 


The importance of simplicity of the model cannot be overstated. Maintenance of a 
simple model is paramount for reproducibility of results as well as implementation of the 
model. Without this simplicity, future models may become convoluted with unnecessary 


parameters and the importance of these parameters will not be understood. 
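
The simplicity measures named in Section 4.1 (number of hidden layers and total number of neurons) can be read directly from a configuration string such as those reported in Tables 5 and 6. The sketch below also counts connection weights. That count assumes full connectivity between adjacent layers and one bias per non-input neuron, which is a common convention for standard backpropagation networks but is not stated in this document.

```python
from typing import Dict


def architecture_summary(config: str) -> Dict[str, int]:
    """Summarize a layer configuration written as 'inputs-hidden...-outputs'."""
    layers = [int(n) for n in config.split("-")]
    hidden = layers[1:-1]
    return {
        "hidden_layers": len(hidden),
        "hidden_neurons": sum(hidden),
        # Assumption: every neuron connects to every neuron in the next layer.
        "weights": sum(a * b for a, b in zip(layers, layers[1:])),
        # Assumption: one bias term per hidden and output neuron.
        "biases": sum(layers[1:]),
    }


model_a = architecture_summary("5-25-6-1")
model_b = architecture_summary("13-39-1")
print(model_a)
print(model_b)
```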

4.2 Best Models (Cumulative Results) 

Determination of the best model was based primarily on the models’ cumulative R² 
values. This measure gives a good overall indication of the models’ predictive 
capabilities by showing overall errors instead of focusing on single-event accuracy. Ten 
prospective models were selected based on the R² criterion. These models were further 
evaluated on their trend-predictive ability, with consideration given to the simplicity of 
the models for future application. 


The two best prospective models are shown below. Model A demonstrated excellent 
predictive ability employing only 5 parameters. Graphical results are illustrated in Figure 
32, with model properties summarized in Table 5. Model B demonstrated even better 
trend-predictive accuracy with 13 parameters. This model’s results are shown in Figure 


33 and summarized in Table 6. 
Figure 32. Prospective model A. 

Table 5. Prospective model A: model specifics. 

R²       Network Architecture         Configuration    Activation Function 
0.920    Standard Backpropagation     5-25-6-1         Linear [-1,1] (scaling); 
                                                       Logistic (all others) 




Figure 33. Prospective model B. 


Table 6. Prospective model B: model specifics. 

R²       Network Architecture         Configuration    Activation Function 
0.986    Standard Backpropagation     13-39-1          Linear [-1,1] (scaling); 
                                                       Logistic (all others) 

The graphical determination of trend-prediction ability (as defined by the matching of 
trend slopes) is an important consideration for any model. It would be impossible to 
predict with much greater accuracy since there are limitations in the availability of 
important input parameters, and since the system is an open domain. Since the actual and 
model trends are cumulative, it should be noted that larger errors between the actual and 
model predictions are also cumulative, thus explaining areas of relatively large 
discrepancy. But because the slopes during these periods are close to identical, they do 


represent a very good event-to-event predictive ability. 


4.3 Event Prediction Models 


Results for prediction of single events were not as favorable as anticipated. This was due 
to the step function of the breaks. With real data outputs which are stepped integers, 
predicting a whole number using a continuous value output proved to be exceedingly 
difficult, no matter which method of event selection criteria was chosen. 
The models, when presented this information, would assign a continuous value 
probability to the output. Use of a threshold output model would have been 
advantageous; however, the models provided in the NeuroShell 2 program limited the 
output step to either 0 or 1, preventing the prediction of multiple breaks, which would not 
serve the purpose of this study. However, one of the typical models, shown in Figure 34, 
demonstrates that the model shows potential for future development. 
Figure 34. Typical event prediction results. 


The models’ single-event prediction capabilities were poor when measured by the R² 
criterion. Evaluation of these models using graphical analysis showed that the models 
could reasonably predict the probability of a break event; however, the severity of the 
event was not necessarily matched. Given the purpose and level of research into this 
matter, initial results are reasonable. However, it is recommended that more research be 
conducted and more inputs be investigated. 
5.0 DISCUSSION 


5.1 Sensitivity Analysis 


A sensitivity analysis was conducted on the chosen models to determine the robustness in 
the prediction of events that had not been previously presented to the model. The 
purpose of the sensitivity analysis was: to determine the model’s actual learning ability; 
to determine the complexity of the models’ learning pathways; and to confirm that the 
model extracted cause-effect logic underlying pipe failures, rather than pure 
memorization of the data. The following parameters were cumulatively tested for their 


robustness: 


• 7-day average temperature (Model A and B); 
• 7-day average temperature, lagged 1 week (Model A and B); 
• 7-day average water temperature (Model A and B); 
• 7-day average (air-water) temperature differential (Model B only); 
• 7-day average (air-water) temperature differential, lagged 1 week (Model A and B); 
• 1-year previous historical break, moving total (Model A and B); 
• 1-week previous historical break, moving total (Model B only); 
• maximum 7-day temperature change (Model B only); 
• maximum absolute 7-day temperature change (Model B only); 
• maximum daily temperature change (Model B only); 
• maximum daily temperature change, lagged 1 week (Model B only); 
• maximum absolute daily temperature change (Model B only); and 
• maximum absolute temperature change, lagged 1 week (Model B only). 


Input parameters were varied in isolation from other independent input variables so that 
their effects on the models’ output parameter could be identified for causality. Input 
parameters related to other input parameters were varied concurrently (e.g. 7-day average 
temperature and 7-day average temperature lagged 1 week were varied simultaneously) 
to maintain consistency of logic. Similarly, parameters related to the variable of interest 
were also adjusted, since varying only one parameter when it is related directly (or 
indirectly) to other variables would invariably “confuse” the model. For example, all 
temperature parameters (7-day average temperatures, 7-day maximum temperatures, etc.) 
were varied concurrently since they depend on the same raw air temperature data. Most 
of the input parameters were adjusted to 30 percent less than and 30 percent greater than 
the models’ original inputted values. However, input parameters based on the number of 
breaks (previous week and previous year moving totals) were instead lowered by 
subtracting breaks from the totals. 


Results of the sensitivity analysis are presented from Figure 35 through Figure 43. 
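
The perturbation scheme described above, in which related inputs are scaled together by ±30 percent while unrelated inputs are held fixed, can be sketched as follows. The `toy_model` function is a hypothetical stand-in for a trained ANN, and the input names and sample patterns are invented for the example. Only the grouping and scaling logic reflects the text.

```python
from typing import Callable, Dict, Iterable, List

Pattern = Dict[str, float]


def perturb(patterns: List[Pattern], group: Iterable[str], factor: float) -> List[Pattern]:
    """Scale every input in `group` by `factor`; leave all other inputs unchanged."""
    names = set(group)
    return [{k: (v * factor if k in names else v) for k, v in row.items()}
            for row in patterns]


def cumulative_change_percent(model: Callable[[Pattern], float], patterns: List[Pattern],
                              group: Iterable[str], factor: float) -> float:
    """Percent change in total predicted breaks after perturbing one input group."""
    base = sum(model(row) for row in patterns)
    varied = sum(model(row) for row in perturb(patterns, group, factor))
    return 100.0 * (varied - base) / base


def toy_model(row: Pattern) -> float:
    """Hypothetical stand-in for a trained network's predicted breaks."""
    return (0.06 * abs(row["air_7d"]) + 0.04 * abs(row["air_7d_lag"])
            + 0.02 * row["water_7d"] + 0.5)


patterns = [
    {"air_7d": -20.0, "air_7d_lag": -15.0, "water_7d": 5.0},
    {"air_7d": -10.0, "air_7d_lag": -20.0, "water_7d": 5.0},
    {"air_7d": 5.0, "air_7d_lag": -10.0, "water_7d": 10.0},
]
air_group = ["air_7d", "air_7d_lag"]   # related inputs are varied concurrently

up = cumulative_change_percent(toy_model, patterns, air_group, 1.3)
down = cumulative_change_percent(toy_model, patterns, air_group, 0.7)
print(round(up, 2), round(down, 2))
```

Because the stand-in model is linear in the perturbed inputs, the two changes are equal and opposite. The asymmetric responses reported for the actual models reflect their non-linearity.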


Model A 


Sensitivity analysis of air temperature parameters (varying 7-day average temperature 
and 7-day average temperature lagged one week) showed that a 30 percent increase in air 
temperature (both positive and negative magnitudes) resulted in a predicted increase in 
cumulative pipe breaks of 25 percent from the actual values. A 30 percent decrease in air 
temperature resulted in a decrease in cumulative pipe breaks of 19 percent, at the end of 
the 6½-year period. Slope changes were more pronounced during winter events for the 
30 percent increases in air temperatures; conversely, they were less pronounced for the 
30 percent decreases. 


The results from this analysis are logical. A percent increase in temperatures translates to 
increase in the magnitudes (i.e. the range of values is increased by the corresponding 
percent). It also translates to an increase in daily and weekly differences. This serves to 
magnify changes in temperature. Changes in temperature are significant for soil-pipeline 
interactions and frost heave action, which supports the previously discussed theories. 
Therefore the model tends to show a cause-effect relationship between temperature 


changes and pipe breaks. 
Figure 35. Model A sensitivity analysis: 7-day average air temperature. 

Graphical analysis of the sensitivity of water temperatures for Model A (Figure 36) 
shows that a 30 percent decrease in water temperature results in a 44 percent increase in 
water main failures. This representation assumes that the air temperature range does not 
change during this manipulation, so that a smaller change in water temperature is not 
accompanied by a proportional change in air temperatures. This creates a differential 
temperature across the wall of the pipe (assuming air temperature gives a reasonable 
reflection of ground temperature). The resulting hoop stresses may result in longitudinal 


split and diagonal failures. 


Analysis also shows that a 30 percent increase in water temperatures results in only a 2 
percent increase in pipe failures. Possibly, the larger ranges in water temperatures 
correspond more closely to the range of the air temperatures, creating a less significant 
pipe wall temperature differential. This assumption is logical since the same reasoning is 
applied to the 30 percent decrease in water temperature. As mentioned, the 30 percent 
decrease would result in a smaller range of water temperature fluctuation. These smaller 
temperature changes (relative to the air temperature changes) again support the theory of 
a larger temperature differential. Another plausible reason for the reactions of this 
model to water temperature variations is that Model A places more importance on this 
parameter in the determination of pipe breaks (recall that the model contains only 5 
parameters). 


Figure 36. Model A sensitivity analysis: 7-day average water temperature. 


Output effects of a 30 percent increase in pipe wall temperature differentials followed 
logical interpretations (Figure 37). As expected, a 30 percent increase in temperatures 
resulted in larger cumulative pipe breaks (a 26 percent increase from the actual value). 
This may be due to the exaggerated maximum and minimum temperatures, or it may be 
due to exaggerated temperature changes. Such changes would logically accelerate pipe 
failure mechanisms such as frost heave (having increased the rate of freezing) and 
soil-pipe interactions (long heated periods followed by cooler temperatures could indicate 
precipitation events, and therefore periods of soil instability). Conversely, a 30 percent 
temperature decrease would understate temperature ranges and changes, and thus indicate 
weaker pipe failure mechanics. 


Figure 37. Model A sensitivity analysis: Pipe-wall temperature differential. 


A lowered 1-year historical break total indicates that the ensuing break trend also 
decreases, as demonstrated by a 16 percent drop in pipe failures. This is possibly due to 
greater structural integrity of the pipe system, with the extent of pipe wall corrosion 
being a possible contributing factor. Results from a higher 1-year historical break total 
show a much more significant rise in the number of cumulative pipe breaks (49 percent 
increase). This model forecast is logical, indicating that structural integrity of the 
piping is an extremely important factor in pipe breaks. 


Figure 38. Model A sensitivity analysis: Historical 1-year break frequency. 


Model B 


A sensitivity analysis was also performed for Model B. Temperature was again 
analyzed, but more air temperature inputs are included with this model (Figure 39). Like 
Model A, Model B indicates that larger fluctuations in air temperature ranges and 
magnitudes result in extremely significant percent increases in pipe breaks. In fact, a 70 
percent increase in pipe breaks is predicted for a 30 percent increase in air temperatures 
(over the 6½ years). Clearly, this model places great importance on the various air 
temperature parameters to indicate pipe failure mechanisms. Lowered ranges (30 percent 
decrease in air temperature range and magnitude) result in fewer cumulative pipe breaks 
(13 percent decrease). As in Model A, this qualitatively supports the idea that frost heave 
and soil-pipeline interactions are major factors in water main failures. 


Figure 39. Model B sensitivity analysis: Air temperature. 


Comparison of the sensitivity analyses for water temperature (Figure 36 and Figure 40) 
shows that Model B does not place as much importance on water temperature. A 9 
percent increase and 2 percent decrease are found for 30 percent increases and decreases, 
respectively, in water temperature. This does not agree qualitatively with the sensitivity 
analysis of Model A. It is apparent that Model B places less significance on the water 
temperature parameter. 


Figure 40. Model B sensitivity analysis: 7-day water temperature. 


Results from the sensitivity analysis of pipe wall temperature differentials were 
extremely significant (see Figure 41). A 30 percent increase in the temperature 
differential results in a remarkable 91 percent increase in pipe break totals. A 30 percent 
decrease results in a 6 percent increase in pipe failures. In this case, the trend of the 
breaks (steep slopes) indicates a rapid accumulation of failures during rapid temperature 
changes. One can infer that pipe wall temperature gradients are an important cause of pipe 
failures. 


Figure 41. Model B sensitivity analysis: Pipe-wall temperature differential. 


The previous week’s pipe breaks input (spatial cluster index) was formulated to try to 
mimic the phenomenon of multiple pipe breaks. As the literature indicates, results show 
that increases in breaks for the previous week increased the likelihood of pipe breaks in 
the present period (refer to Figure 42). By the same reasoning, no breaks in the previous 
week decrease the chances of subsequent multiple pipe failures occurring due to spatial 
clustering. It may be inferred that multiple pipe failures for this area would decrease if 
pipe failures in the previous week were minimized. 


Figure 42. Model B sensitivity analysis: Spatial cluster index. 


Finally, as in Model A, it was found that the previous year’s total breaks was a good 
indication of the potential for future pipe failures. A 7-break increase results in 
a 65 percent increase in pipe breaks. A 7-break decrease results in only a 6 percent 
decrease in water main failures. This large discrepancy can be explained in that pipe 
failures will occur inevitably, in large part due to temperature and corrosion effects. An 
uncharacteristically high break total indicates that pipes are in a weakened state. Lower 
break totals indicate that pipes are typically stronger, or perhaps are indicative of work 
performed on the pipes (e.g., possible mitigation by cathodic protection, pipe section 
repair or replacement). The historical break total gives an indirect indication of the 
pipe’s integrity, possibly implying corrosion influences. 


Figure 43. Model B sensitivity analysis: Historical 1-year break frequency. 


5.2. Apparent Influential and Causal Factors 


From the sensitivity analysis, there is evidence that the learned, intrinsic logic underlying 
the models developed with the ANN methodology is consistent. This is demonstrated by 
the relatively consistent trends predicted by the models, and by the sensitivity analyses’ 
apparent support of the relationships between the input parameters and the pipe failure 
mechanisms. Effects of frost heave, soil-pipeline interaction and pipe wall temperature 
differentials are strongly related to the inputs presented. Inferences can also be made 
from the 1-year previous historical breaks, the spatial cluster index, and possibly the 
water temperature parameters. 


Frost heave mechanisms are related to the rate of frost penetration. As no ground 
temperature data could be found, air temperature was used to characterize this 
phenomenon. As discussed earlier, this is a relatively good approximation if the 
information is given in time series. By presenting the temperatures in time order, the air 
temperature can indicate a rate of change, which the model can then intrinsically relate to 
ground temperature. Larger changes in temperature and larger negative magnitudes 
indicate that frost penetration rates will be higher. This leads to greater frost heaving, 
and therefore greater stresses applied to the pipe. Logically then, more pipe breaks 
should occur. Both models’ sensitivity analyses show this to be the case (Figure 35 and 
Figure 39). Conversely, smaller temperature changes and smaller temperature 
magnitudes have indicated fewer breaks. 
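

As an illustration of how time-ordered air temperatures can expose a rate of change, the 
sketch below builds 7-day averages, a one-week lag, and the week-over-week difference. 
The averaging scheme (non-overlapping weekly blocks), the function names, and the sample 
values are assumptions made for illustration; the exact input construction used for the 
models is not restated here. 

```python
from typing import List, Optional


def seven_day_averages(daily: List[float]) -> List[float]:
    """Average consecutive, non-overlapping 7-day blocks of daily readings.
    A trailing partial week is dropped."""
    return [sum(daily[i:i + 7]) / 7.0 for i in range(0, len(daily) - 6, 7)]


def lag_one_week(weekly: List[float]) -> List[Optional[float]]:
    """Shift the weekly series by one step; the first week has no predecessor."""
    return [None] + list(weekly[:-1])


def week_over_week_change(weekly: List[float]) -> List[Optional[float]]:
    """Current week minus previous week: a crude proxy for the rate of change."""
    lagged = lag_one_week(weekly)
    return [None if prev is None else cur - prev for cur, prev in zip(weekly, lagged)]


# Made-up daily mean air temperatures (deg C) for three weeks of steady cooling.
daily_air = [0.0] * 7 + [-7.0] * 7 + [-14.0] * 7

weekly_air = seven_day_averages(daily_air)
print(weekly_air)
print(lag_one_week(weekly_air))
print(week_over_week_change(weekly_air))
```

Supplying both the current and the lagged average gives a model access to the difference 
between them, which is how a rate of cooling can be inferred without ground temperature 
measurements. 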


Soil-pipeline interactions are caused by sudden drops in temperature. This type of 
behavior is consistent with the sensitivity analyses performed. Magnifying the air 
temperatures by a factor of 1.3 causes differences in temperature to be 1.3 times greater 
as well. This also translates to increases in both negative and positive magnitudes, and 
to magnification of changes in temperature. With this reasoning in mind, these 
magnifications should also result in more pipe failures. This has been demonstrated. In 
contrast, taking 0.7 (70 percent) of the values results in smaller temperature changes and 
relatively smaller magnitudes. This results in fewer pipe failures, which is also 
illustrated in both models’ analyses. 


For these models, the pipe wall temperature gradients were represented by the difference 
between the average air and average water temperatures. Air temperature is indicative of 
the ground temperature (which contacts the exterior of the pipe) and water temperature is 
indicative of the interior of the pipe. Differences in temperature result in higher hoop 
stresses exerted on the pipe. Increasing temperature differentials should increase 
stresses, resulting in more water main failures. Decreasing temperature differentials 
should translate to lower stress levels and, therefore, fewer failures. Both Model A and 
Model B sensitivity analyses (Figure 37 and Figure 41) reflect these effects on pipe break 
totals. 
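

The pipe wall temperature differential described above can be sketched as a per-period 
difference of two averaged series, which is then scaled for the sensitivity runs. The sign 
convention (air minus water), the function names, and the sample values are assumptions 
for illustration only. 

```python
from typing import List


def pipe_wall_differential(avg_air: List[float], avg_water: List[float]) -> List[float]:
    """Per-period proxy for the pipe wall temperature gradient.
    Air minus water is an assumed sign convention."""
    if len(avg_air) != len(avg_water):
        raise ValueError("air and water series must cover the same periods")
    return [air - water for air, water in zip(avg_air, avg_water)]


def scale(series: List[float], factor: float) -> List[float]:
    """Multiply every value by `factor`, as in the 1.3 and 0.7 sensitivity runs."""
    return [value * factor for value in series]


# Made-up 7-day averages (deg C) for a winter, a shoulder, and a summer week.
avg_air = [-20.0, 0.0, 15.0]
avg_water = [2.0, 4.0, 12.0]

differential = pipe_wall_differential(avg_air, avg_water)
increased = scale(differential, 1.3)
decreased = scale(differential, 0.7)
print(differential)
print(increased)
print(decreased)
```

In this toy series the largest differentials occur in winter, when the air is far colder 
than the water, so scaling by 1.3 widens the winter gap the most in absolute terms. 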


The historical 1-year previous break frequency indicates the study area’s past year break 
history. It gives an indication of the stability of the system, therefore indicating the 
potential for future breaks. If fewer breaks occur in the previous year, this may be 
indicative of a sturdy pipe infrastructure, made durable by mitigative actions (e.g., pipe 
replacements, pipe repairs, or cathodic protection). More breaks may be indicative of 
increasing corrosion problems, or other activities that have caused pipe instability. 
Therefore, increasing the past year’s break total indicates more instability. Results from 
both models indicate these trends are consistent (Figure 38 and Figure 43). 


Analysis of the spatial cluster index (1-week previous pipe break total) indicates a logic 
similar to that of the historical 1-year break frequency. Decreasing the number of breaks 
to zero indicates fewer breaks in the upcoming week. Increasing the number of breaks by 
three shows an increase in break frequency in the upcoming week. This clustering 
phenomenon was attributed to disturbances in the surrounding soils, causing further 
instability and settlement. This instability increases the likelihood of pipe failures. 
Although this relationship can only be inferred, it seems to be a plausible 
justification. 


The sensitivity analysis of the water temperature parameter does not yield completely 
consistent results. While both models (Figure 36 and Figure 40) demonstrate that 
increased water temperatures (ranges and magnitudes) have negligible effects on pipe 
breaks, decreases in water temperatures produce a significant increase in pipe breaks for 
Model A only. For Model B, the effect of lowered water temperature ranges and 
magnitudes is also negligible. 


In order to rationalize the logic of the model, it must be noted that water temperatures 
will always be positive values (since water freezes at 0°C). Scaling values by a factor of 
1.3 depicts warmer water temperatures. Scaling by a factor of 0.7 results in a smaller 
temperature range, and cooler year-round water temperatures. Therefore, values scaled by 
a factor of 1.3 are interpreted as warmer temperatures year-round, which is less conducive 
to pipe breaks. Values scaled by a factor of 0.7 translate to cooler water temperatures 
year-round, which is more conducive to pipe failures. Since Model A has only 5 
parameters, compared to Model B’s 13 parameters, Model A must infer more information 
from fewer parameters, which may be somewhat simplistic. The fact that Model B has more 
than double the inputs of Model A also explains why the percent change of cumulative pipe 
breaks (for each common input parameter) is different. However, qualitative analysis is in 
agreement, which is the primary concern for this modeling study. 
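

The different effect of the same scaling factor on signed air temperatures and on 
always-positive water temperatures can be checked with a short sketch. The function name 
and the sample temperatures are illustrative assumptions, not study data. 

```python
from typing import List, Tuple


def scaled_summary(series: List[float], factor: float) -> Tuple[float, float, float]:
    """Return (minimum, maximum, range) of the series after scaling by `factor`."""
    scaled = [value * factor for value in series]
    low, high = min(scaled), max(scaled)
    return low, high, high - low


# Made-up seasonal extremes and midpoint (deg C).
air = [-30.0, 0.0, 25.0]    # signed: winters below zero, summers above
water = [1.0, 8.0, 15.0]    # always positive, since liquid water stays above 0 deg C

# Scaling signed values by 1.3 pushes winters colder AND summers warmer.
air_low, air_high, air_range = scaled_summary(air, 1.3)

# Scaling positive values by 0.7 lowers every reading and narrows the range.
water_low, water_high, water_range = scaled_summary(water, 0.7)

print(air_low, air_high, air_range)
print(water_low, water_high, water_range)
```

Because scaling acts about zero, a factor applied to air temperature exaggerates both 
extremes, whereas the same kind of factor applied to water temperature shifts the whole 
series in one direction. 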


From the models’ sensitivity analyses presented above, one can infer significant findings. 
The models’ analyses support the literature on the nature of pipe break mechanics, and 
they generally adhere to the described failure mechanisms. Therefore, the significance of 
the different input parameters is generally understood. 


5.3. Model Capabilities, Limitations, and Uses 


Quantitative and qualitative analyses of the two models provide evidence that the 
Artificial Neural Network methodology is capable of predicting water main failures. 
The R² statistics and trend-predicting abilities of the models indicate the potential for 
developing accurate, forecast-capable models. The sensitivity analysis shows that the 
intrinsic logic of the various failure mechanisms is credibly captured. Manipulation of 
the model input parameters also allows for inferences about the prevailing failure 
mechanisms of the study area. Having done this, it is possible to implement the 
appropriate mitigative techniques. 
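

For reference, a goodness-of-fit statistic of this kind can be computed as sketched below. 
The sketch assumes the standard coefficient of determination, 1 - SS_res / SS_tot; the 
exact formula used for the statistics reported in this study is not restated in this 
section, and the sample values are made up. 

```python
from typing import Sequence


def r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0.0:
        raise ValueError("actual values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


# Made-up cumulative break counts, for illustration only.
actual_cumulative = [1.0, 3.0, 4.0, 8.0, 9.0]
predicted_cumulative = [1.5, 2.5, 4.5, 7.0, 9.5]
print(round(r_squared(actual_cumulative, predicted_cumulative), 3))
```

A value of 1 means the predictions match the observations exactly, while a value of 0 
means they do no better than predicting the mean of the observations. 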


Due to assumptions made during source data analysis and data collection, these models 
are most applicable to specific situations that are directly related to the reliability and 
availability of raw source data. The assumption that the area was composed of a uniform 
soil type does not allow for this model to be applied to non-uniform soils, nor is it 
necessarily applicable to different soil types. Since modeling was restricted to the 150 
mm diameter cast iron pipe, modeling of larger-size pipes, or different pipe materials 
(e.g., asbestos cement or PVC) may require extensive modification of input parameters 
because different failure mechanisms will dominate. For the same reasons, application to 
warmer climates may warrant use of different types of parameters (to reflect the failure 
mechanisms that dominate in warmer climates). 


Further to this, these models are limited by the open-system nature of the study. The 
models will not predict random events that cause pipe breaks, such as water hammer 
events and severe weather events. However, it may be possible to determine what 
percentage of all pipe breaks will be of a random nature, and use this information as a 
prediction tolerance. Further study is required to accomplish this. 


These models are dependent on the historical information of the study area, and are 
therefore specific to the area studied. They must be retrained for other subdivisions. 
The results presented do favorably indicate the applicability of the methodology to the 
study of pipe breaks. However, the models themselves cannot be viewed as tools applicable 
to the entire City of Edmonton’s cast iron water distribution system. 


The models developed illustrate the utility of using Artificial Neural Network 
methodology for predicting pipe breaks. While the models are limited in application for 
the above reasons, they have valuable uses, along with newly developed ANN models. 
Development of ANN pipe break models for different subdivisions within Edmonton will 
make it possible to prioritize areas for the Cast Iron Renewal Program. A model may also 
be useful as a monitoring tool, to evaluate the progress of past water main renewal 
efforts. As mentioned earlier, manipulation of model input parameters would allow 
inferences as to pipe failure causes, and so allow for appropriate mitigation techniques. 


Potential use of the model for pipe break prediction must also include a methodology to 
forecast the input parameters, namely air and water temperatures. The models use a full 
array of accessible information to accurately forecast probabilities of pipe breaks in time 
series. Therefore, input parameters should also be able to indicate severe or abnormal 
conditions. This will permit more accurate forecasts of pipe breaks. Once this can be 
accomplished, implementation of the model for practical purposes becomes possible. 


6.0 CONCLUSIONS AND RECOMMENDATIONS 


With respect to developing a prediction-capable pipe break model for a given city 
subdivision, the goal has been achieved. Utilizing readily available information, the 
Artificial Neural Network methodology has proven its feasibility in this regard. The 
forecasting ability of this model has been demonstrated using the Calder subdivision as a 
model study. Quantitative analysis using R² statistics and visual examination of trends 
(slope matching) has permitted appropriate selection of models based on accuracy and 
trend-matching capability. Qualitative analysis, in the form of sensitivity analysis, was 
performed to demonstrate the ANN models’ ability to “learn” the intrinsic logic 
underlying pipe failure mechanisms. Through the careful model development 
methodology and evaluation, coupled with the sensitivity analysis, it is shown that 
concepts are being extracted from the input parameters, rather than the model purely 
“memorizing” the input data in order to predict the output. 


The ANN models also demonstrate their potential applicability as screening tools. The 
developed models were able to accurately predict the cumulative number of pipe failures 
for the six-and-a-half-year study period. Application to other city subdivisions 
would offer comparative information for priority setting for the Cast Iron Renewal 
Program. 


Manipulation of input parameters in the developed models permits their use for inferring 
which of the cast iron pipe failure mechanisms dominate. From the sensitivity analysis 
results, it is clear that frost heave, soil-pipeline interaction and pipe wall temperature 
gradients are responsible for a large portion of the 150 mm cast iron water main failures. 
Therefore, air temperature plays a very significant role in the model prediction. Other 
parameters show significance, but further investigation is required to make quantifiable 
conclusions. 


Since the developed models do not include a number of specific measures thought to be 
important to pipe failures, the models developed in this exercise are not complete. While 
they do demonstrate the utility of using Artificial Neural Networks for predicting pipe 
breaks, further work on data collection and model development is required to ensure the 
model is flexible for future applications. From the perspective of frost action, 
considerations may include further examination of the freezing index as a potential model 
input parameter. This index would provide a generalization of the severity of a winter 
event. Another possibility is to examine the effect of pipe-trench backfill material and 
ground surface (e.g., asphalt, clay cover, gravel cover, or other). Both backfill materials 
and ground surface will change the exchange of heat between ground and air, thus 
varying the effect of temperature transitions on pipe breaks. 
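

As a rough illustration of how a freezing index could summarize winter severity, the 
sketch below accumulates degree-days below 0°C from daily mean air temperatures. This is 
a simplified form chosen for illustration; formal definitions are usually based on the 
cumulative degree-day curve over a whole freezing season, and the function name and sample 
values here are assumptions. 

```python
from typing import Sequence


def freezing_index(daily_mean_temps_c: Sequence[float]) -> float:
    """Simplified freezing index: total degree-days below 0 deg C.
    Each day contributes its mean temperature's distance below freezing;
    days at or above freezing contribute nothing."""
    return sum(-temp for temp in daily_mean_temps_c if temp < 0.0)


# Made-up daily mean air temperatures (deg C) for two short cold spells.
mild_spell = [-2.0, -1.0, 1.0, 0.0, -3.0]
severe_spell = [-20.0, -25.0, -18.0, -22.0, -15.0]

print(freezing_index(mild_spell))
print(freezing_index(severe_spell))
```

A single number of this kind could be supplied as one additional input per period, so 
that a severe winter is distinguished from a mild one without adding many separate 
temperature parameters. 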


Having made the above conclusions, it is clear that more work is required to facilitate 
future use. However, the models presented have useful applications for the Calder area. 
Given that both Models A and B demonstrated exceptional accuracy in prediction, these 
models may be used to diagnose existing problems in the area. These ANN models are 
capable of predicting the frequency of pipe failures caused by frost heave, soil-pipeline 
interaction and pipe-wall temperature gradients. If there is an uncharacteristically large 
discrepancy between the actual number of pipe breaks and the number of pipe breaks 
predicted by the ANN models, it is likely that other failure mechanisms are 
contributing to pipe breaks in the area. These may include corrosion problems, 
operating pressure-related problems (e.g., water hammer events or pumping problems), or 
other circumstances requiring detailed investigation. In any case, the model will be a 
useful information tool for diagnosing this possibility. 


This model study illustrates the need for the following actions to facilitate easier and 
more comprehensive development of Artificial Neural Network models for water main 
failure prediction: 


1. Inclusion of more descriptive data; 
2. Collection technique improvements of present data; 
3. Characterization of temperature data; and 
4. Exploration of input parameter importance, and special phenomena. 


To further develop ANN models that are accurate and flexible, inclusion of more 
descriptive data is needed. The initial models required assumptions that limited their 
scope, since these assumptions held certain values constant. The availability of detailed 
soils parameters, physical pipe characteristics, and in-situ pipe conditions would be 
assets. Soils parameters and physical pipe characteristics would describe the cast iron 
pipe and its environment more explicitly. In-situ pipe conditions, possibly collected from 
hydroscope measurements, may also be of value. Overall, the goal of including these 
parameters would be to widen the scope of application of the models, instead of limiting 
areas of application. 


Source data collection in this study demonstrated a need for more complete and accurate 
data. Much of the data used for this study was available only in hardcopy and was 
difficult to obtain, or was of insufficient detail or quantity. As a large amount of quality 
data is required for ANN application, it is recommended that more complete databases 
be kept, and that this information be kept up to date, to facilitate collection of raw data. 


Due to the importance of air temperature to the ANN models developed, it is 
recommended that weather data be characterized such that typical years, years with 
above- and below-average temperatures, and other special events can be identified and 
then used for pipe break sensitivity analysis. Inclusion of a freezing index may be a 
possible avenue for characterizing weather. The sensitivity analysis tool then potentially 
becomes more valuable when pipe break rates between areas are similar. 


To accurately quantify the effect of certain input parameters for a given area, it will be 
necessary to develop further ANN models, using this study as a starting base. 
This study has demonstrated the feasibility of the Artificial Neural Network methodology 
and its effectiveness as a diagnostic tool. However, subsequent models with more 
descriptive parameters will enhance the understanding of the effects of individual causal 
or influencing input parameters on cast iron water main failures. 


APPENDIX A. Sample Model Input Data, Model A. 